Abstract

We present a complete classification of all possible sets of classical reversible gates acting 
on bits, in terms of which reversible transformations they generate, assuming swaps and ancilla 
bits are available for free. Our classification can be seen as the reversible-computing analogue 
of Post’s lattice, a central result in mathematical logic from the 1940s. It is a step toward the 
ambitious goal of classifying all possible quantum gate sets acting on qubits. 

Our theorem implies a linear-time algorithm (which we have implemented) that takes as
input the truth tables of reversible gates G and H and decides whether G generates H.
Previously, this problem was not even known to be decidable (though with effort, one can derive 
from abstract considerations an algorithm that takes triply-exponential time). The theorem 
also implies that any n-bit reversible circuit can be “compressed” to an equivalent circuit, over 
the same gates, that uses at most $2^n \mathrm{poly}(n)$ gates and $O(1)$ ancilla bits; these are the first
upper bounds on these quantities known, and are close to optimal. Finally, the theorem implies 
that every non-degenerate reversible gate can implement either every reversible transformation, 
or every affine transformation, when restricted to an “encoded subspace.” 

Briefly, the theorem says that every set of reversible gates generates either all reversible
transformations on n-bit strings (as the Toffoli gate does); no transformations; all transformations
that preserve Hamming weight (as the Fredkin gate does); all transformations that preserve 
Hamming weight mod k for some k; all affine transformations (as the Controlled-NOT gate 
does); all affine transformations that preserve Hamming weight mod 2 or mod 4, inner products 
mod 2, or a combination thereof; or a previous class augmented by a NOT or NOTNOT gate. 
Prior to this work, it was not even known that every class was finitely generated. Ruling out 
the possibility of additional classes, not in the list, requires some arguments about polynomials, 
lattices, and Diophantine equations. 
1 Introduction


The pervasiveness of universality (that is, the likelihood that a small number of simple operations
already generate all operations in some relevant class) is one of the central phenomena in computer
science. It appears, among other places, in the ability of simple logic gates to generate all
Boolean functions (and of simple quantum gates to generate all unitary transformations); and in
the simplicity of the rule sets that lead to Turing-universality, or to formal systems to which Gödel’s
theorems apply. Yet precisely because universality is so pervasive, it is often more interesting to 
understand the ways in which systems can fail to be universal. 

In 1941, the great logician Emil Post [22] published a complete classification of all the ways in 
which sets of Boolean logic gates can fail to be universal; for example, by being monotone (like the 
AND and OR gates) or by being affine over $\mathbb{F}_2$ (like NOT and XOR). In universal algebra, closed
classes of functions are known, somewhat opaquely, as clones, while the inclusion diagram of all 
Boolean clones is called Post’s lattice. Post’s lattice is surprisingly complicated, in part because 
Post did not assume that the constant functions 0 and 1 were available for free.^1

This paper had its origin in our ambition to find the analogue of Post’s lattice for all possible sets 
of quantum gates acting on qubits. We view this as a large, important, and underappreciated goal: 
something that could be to quantum computing theory almost what the Classification of Finite 
Simple Groups was to group theory. To provide some context, there are many finite sets of 1-, 2- 
and 3-qubit quantum gates that are known to be universal—either in the strong sense that they 
can be used to approximate any n-qubit unitary transformation to any desired precision, or in the 
weaker sense that they suffice to perform universal quantum computation (possibly in an encoded 
subspace). To take two examples, Barenco et al. [5] showed universality for the CNOT gate plus
the set of all 1-qubit gates, while Shi [26] showed universality for the Toffoli and Hadamard gates. 

There are also sets of quantum gates that are known not to be universal: for example, the
basis-preserving gates, the 1-qubit gates, and most interestingly, the so-called stabilizer gates [11][3] (that
is, the CNOT, Hadamard, and π/4-Phase gates), as well as the stabilizer gates conjugated by
1-qubit unitary transformations. What is not known is whether the preceding list basically exhausts
the ways in which quantum gates on qubits can fail to be universal. Are there other elegant 
discrete structures, analogous to the stabilizer gates, waiting to be discovered? Are there any gate 
sets, other than conjugated stabilizer gates, that might give rise to intermediate complexity classes, 
neither contained in P nor equal to BQP?^2 How can we claim to understand quantum circuits (the
bread-and-butter of quantum computing textbooks and introductory quantum computing courses)
if we do not know the answers to such questions?

Unfortunately, working out the full “quantum Post’s lattice” appears out of reach at present. 
This might surprise readers, given how much is known about particular quantum gate sets (e.g., 
those containing CNOT gates), but keep in mind that what is asked for is an accounting of all
possibilities, no matter how exotic. Indeed, even classifying 1- and 2-qubit quantum gate sets remains
wide open (!), and seems, without a new idea, to require studying the irreducible representations 

^1 In Appendix 12 we prove for completeness that if one does assume constants are free, then Post’s lattice
dramatically simplifies, with all non-universal gate sets either monotone or affine.

^2 To clarify, there are many restricted models of quantum computing known that are plausibly “intermediate” in
that sense, including BosonSampling [1], the one-clean-qubit model [?], and log-depth quantum circuits [8]. However,
with the exception of conjugated stabilizer gates, none of those models arises from simply considering which unitary
transformations can be generated by some set of k-qubit gates. They all involve non-standard initial states, building
blocks other than qubits, or restrictions on how the gates can be composed. 
of thousands of groups. Recently, Aaronson and Bouland [2] completed a much simpler task, the
classification of 2-mode beamsplitters; that was already a complicated undertaking. 

1.1 Classical Reversible Gates 

So one might wonder: can we at least understand all the possible sets of classical reversible gates 
acting on bits, in terms of which reversible transformations they generate? This is an obvious
prerequisite to the quantum case, since every classical reversible gate is also a unitary quantum 
gate. But beyond that, the classical problem is extremely interesting in its own right, with (as 
it turns out) a rich algebraic and number-theoretic structure, and with many implications for 
reversible computing as a whole. 

The notion of reversible computing [?] arose from early work on the physics of
computation, by such figures as Feynman, Bennett, Benioff, Landauer, Fredkin, Toffoli, and Lloyd. 
This community was interested in questions like: does universal computation inherently require 
the generation of entropy (say, in the form of waste heat)? Surprisingly, the theory of reversible 
computing showed that, in principle, the answer to this question is “no.” Deleting information 
unavoidably generates entropy, according to Landauer’s principle [?], but deleting information is
not necessary for universal computation. 

Formally, a reversible gate is just a permutation $G : \{0,1\}^k \to \{0,1\}^k$ of the set of k-bit strings,
for some positive integer k. The most famous examples are: 

• the 2-bit CNOT (Controlled-NOT) gate, which flips the second bit if and only if the first bit 
is 1; 

• the 3-bit Toffoli gate, which flips the third bit if and only if the first two bits are both 1; 

• the 3-bit Fredkin gate, which swaps the second and third bits if and only if the first bit is 1. 

These three gates already illustrate some of the concepts that play important roles in this paper. 
The CNOT gate can be used to copy information in a reversible way, since it maps x0 to xx; and also
to compute arbitrary affine functions over the finite field $\mathbb{F}_2$. However, because CNOT is limited
to affine transformations, it is not computationally universal. Indeed, in contrast to the situation 
with irreversible logic gates, one can show that no 2-bit classical reversible gate is computationally 
universal. The Toffoli gate is computationally universal, because (for example) it maps x, y, 1 to
$x, y, \overline{xy}$, thereby computing the NAND function. Moreover, Toffoli showed [28] (and we prove for
completeness in Section 7.1) that the Toffoli gate is universal in a stronger sense: it generates all
possible reversible transformations $F : \{0,1\}^n \to \{0,1\}^n$ if one allows the use of ancilla bits, which
must be returned to their initial states by the end. 
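
As a concrete check of the claims above, the following minimal Python sketch models CNOT and Toffoli as functions on bit tuples. It verifies that each is a permutation, that CNOT maps x0 to xx, and that Toffoli with its third input fixed to 1 outputs NAND(x, y). The function names and the tuple representation are illustrative assumptions, not notation from this paper.

```python
from itertools import product

def cnot(bits):
    """Flip the second bit if and only if the first bit is 1."""
    a, b = bits
    return (a, b ^ a)

def toffoli(bits):
    """Flip the third bit if and only if the first two bits are both 1."""
    a, b, c = bits
    return (a, b, c ^ (a & b))

def is_permutation(gate, k):
    """A k-bit gate is reversible exactly when it permutes {0,1}^k."""
    inputs = list(product((0, 1), repeat=k))  # already in sorted order
    return sorted(gate(x) for x in inputs) == inputs

# CNOT copies a bit into a 0 ancilla: x0 -> xx.
copies = all(cnot((x, 0)) == (x, x) for x in (0, 1))

# Toffoli with the third bit set to 1 leaves NAND(x, y) in the third position.
nand_ok = all(
    toffoli((x, y, 1)) == (x, y, 1 - (x & y))
    for x, y in product((0, 1), repeat=2)
)
```

Here `copies` and `nand_ok` both evaluate to true, and `is_permutation` rejects a non-reversible map such as one that overwrites the second bit with the first.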

But perhaps the most interesting case is that of the Fredkin gate. Like the Toffoli gate, 
the Fredkin gate is computationally universal: for example, it maps x, y, 0 to $x, \bar{x}y, xy$, thereby
computing the AND function. But the Fredkin gate is not universal in the stronger sense. The 
reason is that it is conservative: that is, it never changes the total Hamming weight of the input. Far 
from being just a technical issue, conservativity was regarded by Fredkin and the other reversible 
computing pioneers as a sort of discrete analogue of the conservation of energy—and indeed, it 
plays a central role in certain physical realizations of reversible computing (for example,
billiard-ball models, in which the total number of billiard balls must be conserved).
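
The conservativity claim is easy to confirm exhaustively. The sketch below (with hypothetical function names) checks that the Fredkin gate preserves Hamming weight on all 8 inputs, that it computes AND in its third output when its third input is 0, and that the Toffoli gate, by contrast, is not conservative.

```python
from itertools import product

def fredkin(bits):
    """Swap the second and third bits if and only if the first bit is 1."""
    a, b, c = bits
    return (a, c, b) if a == 1 else (a, b, c)

def toffoli(bits):
    """Flip the third bit if and only if the first two bits are both 1."""
    a, b, c = bits
    return (a, b, c ^ (a & b))

def is_conservative(gate, k):
    """True when the gate never changes the Hamming weight of its input."""
    return all(sum(gate(x)) == sum(x) for x in product((0, 1), repeat=k))

conservative = is_conservative(fredkin, 3)
toffoli_conservative = is_conservative(toffoli, 3)

# On input (x, y, 0) the Fredkin gate outputs (x, (not x) and y, x and y).
and_ok = all(
    fredkin((x, y, 0)) == (x, (1 - x) & y, x & y)
    for x, y in product((0, 1), repeat=2)
)
```
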

However, all we have seen so far are three specific examples of reversible gates, each leading
to a different behavior. To anyone with a mathematical mindset, the question remains: what 
are all the possible behaviors? For example: is Hamming weight the only possible “conserved 
quantity” in reversible computation? Are there other ways, besides being affine, to fail to be 
computationally universal? Can one derive, from first principles, why the classes of reversible 
transformations generated by CNOT, Fredkin, etc. are somehow special, rather than just pointing 
to the sociological fact that these are classes that people in the early 1980s happened to study? 


1.2 Ground Rules 


In this work, we achieve a complete classification of all possible sets of reversible gates acting on 
bits, in terms of which reversible transformations $F : \{0,1\}^n \to \{0,1\}^n$ they generate. Before
describing our result, let us carefully explain the ground rules. 

First, we assume that swapping bits is free. This simply means that we do not care how the 
input bits are labeled—or, if we imagine the bits carried by wires, then we can permute the wires 
in any way we like. The second rule is that an unlimited number of ancilla bits may be used, 
provided the ancilla bits are returned to their initial states by the end of the computation. This 
second rule might look unfamiliar, but in the context of reversible computing, it is the right choice. 

We need to allow ancilla bits because if we do not, then countless transformations are disallowed 
for trivial reasons. (Restricting a reversible circuit to use no ancillas is like restricting a Turing 
machine to use no memory, besides the n bits that are used to write down the input.) We are forced
to say that, although our gates might generate some reversible transformation $F(x, 0) = (G(x), 0)$,
they do not generate the smaller transformation G. The exact value of n then also takes on 
undeserved importance, as we need to worry about “small-n effects”: e.g., that a 3-bit gate cannot 
be applied to a 2-bit input. 

As for the number of ancilla bits: it will turn out, because of our classification theorem, that 
every reversible gate needs only $O(1)$ ancilla bits^3 to generate every n-bit reversible transformation
that it can generate at all. However, we do not wish to prejudge this question; if there had been 
reversible gates that could generate certain transformations, but only by using (say) $2^{2^n}$ ancilla bits,
then that would have been fascinating to know. For the same reason, we do not wish prematurely 
to restrict the number of ancilla bits that can be 0, or the number that can be 1. 

On the other hand, the ancilla bits must be returned to their original states because if they 
are not, then the computation was not really reversible. One can then learn something about the 
computation by examining the ancilla bits—if nothing else, then the fact that the computation 
was done at all. The symmetry between input and output is broken; one cannot then run the 
computation backwards without setting the ancilla bits differently. This is not just a philosophical 
problem: if the ancilla bits carry away information about the input x, then entropy, or waste heat, 
has been leaked into the computer’s environment. Worse yet, if the reversible computation is a 
subroutine of a quantum computation, then the leaked entropy will cause decoherence, preventing 
the branches of the quantum superposition with different x values from interfering with each other, 
as is needed to obtain a quantum speedup. In reversible computing, the technical term for ancilla 
bits that still depend on x after a computation is complete is garbage.^4


^3 Since it is easy to show that a constant number of ancilla bits are sometimes needed (see Proposition 9), this is
the optimal answer, up to the value of the constant (which might depend on the gate set). 

^4 In Section [?] and Appendix 13 we will discuss a modified rule, which allows a reversible circuit to change the
ancilla bits, as long as they change in a way that is independent of the input x. We will show that this “loose ancilla
1.3 Our Results


Even after we assume that bit swaps and ancilla bits are free, it remains a significant undertaking 
to work out the complete list of reversible gate classes, and (especially!) to prove that the list is 
complete. Doing so is this paper’s main technical contribution. 

We give a formal statement of the classification theorem in Section 3, and we show the lattice
of reversible gate classes in Figure 3. (In Appendix [?] we also calculate the exact number of 3-bit
gates that generate each class.) For now, let us simply state the main conclusions informally.

(1) Conserved Quantities. The following is the complete list of the “global quantities” that 
reversible gate sets can conserve (if we restrict attention to non-degenerate gate sets, and 
ignore certain complications caused by linearity and affineness): Hamming weight, Hamming 
weight mod k for any k ≥ 2, and inner product mod 2 between pairs of inputs.

(2) Anti-Conservation. There are gates, such as the NOT gate, that “anti-conserve” the 
Hamming weight mod 2 (i.e., always change it by a fixed nonzero amount). However, there 
are no analogues of these for any of the other conserved quantities. 

(3) Encoded Universality. In terms of their “computational power,” there are only three 
kinds of reversible gate sets: degenerate (e.g., NOTs, bit-swaps), non-degenerate but affine 
(e.g., CNOT), and non-affine (e.g., Toffoli, Fredkin). More interestingly, every non-affine 
gate set can implement every reversible transformation, and every non-degenerate affine gate 
set can implement every affine transformation, if the input and output bits are encoded by 
longer strings in a suitable way. For details about “encoded universality,” see Section 4.4.

(4) Sporadic Gate Sets. The conserved quantities interact with linearity and affineness in 
complicated ways, producing “sporadic” affine gate sets that we have classified. For example, 
non-degenerate affine gates can preserve Hamming weight mod k, but only if k = 2 or k = 4.
All gates that preserve inner product mod 2 are linear, and all linear gates that preserve 
Hamming weight mod 4 also preserve inner product mod 2. As a further complication, affine 
gates can be orthogonal or mod-2-preserving or mod-4-preserving in their linear part, but not 
in their affine part. 

(5) Finite Generation. For each closed class of reversible transformations, there is a single 
gate that generates the entire class. (A priori, it is not even obvious that every class is finitely
generated, or that there is “only” a countable infinity of classes!) For more, see Section 4.1.

(6) Symmetry. Every reversible gate set is symmetric under interchanging the roles of 0 and 
1. For more, see Section [?].

1.4 Algorithmic and Complexity Aspects 

Perhaps most relevant to theoretical computer scientists, our classification theorem leads to new 
algorithms and complexity results about reversible gates and circuits: results that follow easily 
from the classification, but that we have no idea how to prove otherwise. 

Let RevGen (Reversible Generation) be the following problem: we are given as input the truth 
tables of reversible gates $G_1, \ldots, G_k$, as well as of a target gate H, and wish to decide whether the

rule” causes only a small change to our classification theorem. 
$G_i$'s generate H. Then we obtain a linear-time algorithm for RevGen. Here, of course, “linear”
means linear in the sizes of the truth tables, which is $n2^n$ for an n-bit gate. However, if just a
tiny amount of “summary data” about each gate G is provided (namely, the possible values of
$|G(x)| - |x|$, where $|\cdot|$ is the Hamming weight, as well as which affine transformation G performs if
it is affine), then the algorithm actually runs in $O(n^\omega)$ time, where $\omega$ is the matrix multiplication
exponent. 
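
To make the “summary data” concrete, here is a minimal Python sketch that extracts both pieces from a truth table: the set of values of $|G(x)| - |x|$, and the affine form $G(x) = Ax \oplus b$ when one exists. This is only an illustration under our own assumptions (gates as Python functions on bit tuples, hypothetical helper names); it is not the implementation referred to in this section, and it makes no attempt at the stated running times.

```python
from itertools import product

def xor(u, v):
    return tuple(a ^ b for a, b in zip(u, v))

def weight_shifts(gate, n):
    """All values taken by |G(x)| - |x| over the 2^n inputs."""
    return {sum(gate(x)) - sum(x) for x in product((0, 1), repeat=n)}

def affine_part(gate, n):
    """Return (columns of A, b) with G(x) = A x + b over F_2, or None."""
    b = gate((0,) * n)
    cols = []
    for i in range(n):
        e_i = tuple(1 if j == i else 0 for j in range(n))
        cols.append(xor(gate(e_i), b))  # i-th column of A
    for x in product((0, 1), repeat=n):
        y = b
        for i in range(n):
            if x[i]:
                y = xor(y, cols[i])
        if y != gate(x):
            return None  # some input disagrees with the affine prediction
    return cols, b

def cnot(bits):
    a, b = bits
    return (a, b ^ a)

def toffoli(bits):
    a, b, c = bits
    return (a, b, c ^ (a & b))
```

For example, `affine_part(cnot, 2)` recovers the matrix columns (1,1) and (0,1) with zero offset, while `affine_part(toffoli, 3)` returns `None` because Toffoli is not affine.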

We have implemented this algorithm; code is available for download at [23]. For more details,
see Section [?].

Our classification theorem also implies the first general upper bounds (i.e., bounds that hold 
for all possible gate sets) on the number of gates and ancilla bits needed to implement reversible 
transformations. In particular, we show (see Section 4.3) that if a set of reversible gates generates
an n-bit transformation F at all, then it does so via a circuit with at most $2^n \mathrm{poly}(n)$ gates and
$O(1)$ ancilla bits. These bounds are close to optimal.

By contrast, let us consider the situation for these problems without the classification theorem. 
Suppose, for example, that we want to know whether a reversible transformation
$H : \{0,1\}^n \to \{0,1\}^n$ can be synthesized using gates $G_1, \ldots, G_k$. If we knew some upper bound on the number
of ancilla bits that might be needed by the generating circuit, then if nothing else, we could of 
course solve this problem by brute force. The trouble is that, without the classification, it is not 
obvious how to prove any upper bound on the number of ancillas—not even, say, Ackermann (n). 
This makes it unclear, a priori, whether RevGen is even decidable, never mind its complexity! 

One can show on abstract grounds that RevGen is decidable, but with an astronomical running 
time. To explain this requires a short digression. In universal algebra, there is a body of theory 
(see e.g. [?]), which grew out of Post’s original work [22], about the general problem of classifying
closed classes of functions (clones) of various kinds. The upshot is that every clone is characterized 
by an invariant that all functions in the clone preserve: for example, affineness for the NOT and 
XOR functions, or monotonicity for the AND and OR functions. The clone can then be shown 
to contain all functions that preserve the invariant. (There is a formal definition of “invariant,” 
involving polymorphisms, which makes this statement not a tautology, but we omit it.) Alongside 
the lattice of clones of functions, there is a dual lattice of coclones of invariants, and there is a 
Galois connection relating the two: as one adds more functions, one preserves fewer invariants, and 
vice versa. 

In response to an inquiry by us, Emil Jeřábek recently showed [?] that the clone/coclone
duality can be adapted to the setting of reversible gates. This means that we know, even without 
a classification theorem, that every closed class of reversible transformations is uniquely determined 
by the invariants that it preserves. 

Unfortunately, this elegant characterization does not give rise to feasible algorithms. The 
reason is that, for an n-bit gate $G : \{0,1\}^n \to \{0,1\}^n$, the invariants could in principle involve
all $2^n$ inputs, as well as arbitrary polymorphisms mapping those inputs into a commutative monoid.
Thus the number of polymorphisms one needs to consider grows at least like $2^{2^n}$. Now, the word
problem for commutative monoids is decidable, by reduction to the ideal membership problem (see, 
e.g., [?, p. 55]). And by putting these facts together, one can derive an algorithm for RevGen
that uses doubly-exponential space and triply-exponential time, as a function of the truth table 
sizes: in other words, exp (exp (exp (exp (n)))) time, as a function of n. We believe it should also 
be possible to extract exp (exp (exp (exp (n)))) upper bounds on the number of gates and ancillas 
from this algorithm, although we have not verified the details. 
1.5 Proof Ideas


We hope we have made the case that the classification theorem improves the complexity situation for 
reversible circuit synthesis! Even so, some people might regard classifying all possible reversible 
gate sets as a complicated, maybe worthwhile, but fundamentally tedious exercise. Can’t such 
problems be automated via computer search? On the contrary, there are specific aspects of 
reversible computation that make this classification problem both unusually rich, and unusually 
hard to reduce to any finite number of cases. 

We already discussed the astronomical number of possible invariants that even a tiny reversible 
gate (say, a 3-bit gate) might satisfy, and the hopelessness of enumerating them by brute force. 
However, even if we could cut down the number of invariants to something reasonable, there 
would still be the problem that the size, n, of a reversible gate can be arbitrarily large—and as 
one considers larger gates, one can discover more and more invariants. Indeed, that is precisely 
what happens in our case, since the Hamming weight mod k invariant can only be “noticed” by 
considering gates on k bits or more. There are also “sporadic” affine classes that can only be found 
by considering 6-bit gates. 

Of course, it is not hard just to guess a large number of reversible gate classes (affine
transformations, parity-preserving and parity-flipping transformations, etc.), prove that these classes are
all distinct, and then prove that each one can be generated by a simple set of gates (e.g., CNOT or
Fredkin+NOT). Also, once one has a sufficiently powerful gate (say, the CNOT gate), it is often
straightforward to classify all the classes containing that gate. So for example, it is relatively easy 
to show that CNOT, together with any non-affine gate, generates all reversible transformations. 

As usual with classification problems, the hard part is to rule out exotic additional classes: most 
of the work, one might say, is not about what is there, but about what isn’t there. It is one thing 
to synthesize some random 1000-bit reversible transformation using only Toffoli gates, but quite
another to synthesize a Toffoli gate using only the random 1000-bit transformation!

Thinking about this brings to the fore the central issue: that in reversible computation, it is 
not enough to output some desired string F(x); one needs to output nothing else besides F(x).
And hence, for example, it does not suffice to look inside the random 1000-bit reversible gate G,
to show that it contains a NAND gate, which is computationally universal. Rather, one needs to 
deal with all of G’s outputs, and show that one can eliminate the undesired ones. 

The way we do that involves another characteristic property of reversible circuits: that they 
can have “global conserved quantities,” such as Hamming weight. Again and again, we need to 
prove that if a reversible gate G fails to conserve some quantity, such as the Hamming weight mod 
k, then that fact alone implies that we can use G to implement a desired behavior. This is where 
elementary algebra and number theory come in. 

There are two aspects to the problem. First, we need to understand something about the 
possible quantities that a reversible gate can conserve. For example, we will need the following 
three results: 

• No non-conservative reversible gate can conserve inner products mod k, unless k = 2. 

• No reversible gate can change Hamming weight mod k by a fixed, nonzero amount, unless
k = 2. 

• No nontrivial linear gate can conserve Hamming weight mod k, unless k = 2 or k = 4. 



We prove each of these statements in Section 6, using arguments based on complex
polynomials. In Appendix 15 we give alternative, more “combinatorial” proofs for the second and third
statements.

Next, using our knowledge about the possible conserved quantities, we need procedures that 
take any gate G that fails to conserve some quantity, and that use G to implement a desired 
behavior (say, making a single copy of a bit, or changing an inner product by exactly 1). We then 
leverage that behavior to generate a desired gate (say, a Fredkin gate). The two core tasks turn 
out to be the following: 

• Given any non-affine gate, we need to construct a Fredkin gate. We do this in Sections 8.3
and 8.4.

• Given any non-orthogonal linear gate, we need to construct a CNOTNOT gate, a
parity-preserving version of CNOT that maps $x, y, z$ to $x, y \oplus x, z \oplus x$. We do this in Section [?].


In both of these cases, our solution involves 3-dimensional lattices: that is, subsets of $\mathbb{Z}^3$ closed
under integer linear combinations. We argue, in essence, that the only possible obstruction to 
the desired behavior is a “modularity obstruction,” but the assumption about the gate G rules out 
such an obstruction. 

We can illustrate this with an example that ends up not being needed in the final classification
proof, but that we worked out earlier in this research.^5 Let G be any gate that does not conserve
(or anti-conserve) the Hamming weight mod k for any k ≥ 2, and suppose we want to use G to
construct a CNOT gate. 




Figure 1: Moving within first quadrant of lattice to construct a COPY gate 

Then we examine how G behaves on restricted inputs: in this case, on inputs that consist entirely 
of some number of copies of x and $\bar{x}$, where $x \in \{0,1\}$ is a bit, as well as constant 0 and 1 bits.

^5 In general, after completing the classification proof, we were able to go back and simplify it substantially, by
removing results (for example, about the generation of CNOT gates) that were important for working out the lattice
in the first place, but which then turned out to be subsumed (or which could be subsumed, with modest additional
effort) by later parts of the classification. Our current proof reflects these simplifications.
For example, perhaps G can increase the number of copies of x by 5 while decreasing the number
of copies of $\bar{x}$ by 7, and can also decrease the number of copies of x by 6 without changing the
number of copies of $\bar{x}$. Whatever the case, the set of possible behaviors generates some lattice: in
this case, a lattice in $\mathbb{Z}^2$ (see Figure 1). We need to argue that the lattice contains a distinguished
point encoding the desired “copying” behavior. In the case of the CNOT gate, the point is (1,0),
since we want one more copy of x and no more copies of $\bar{x}$. Showing that the lattice contains
(1,0), in turn, boils down to arguing that a certain system of Diophantine linear equations must 
have a solution. One can do this, finally, by using the assumption that G does not conserve or 
anti-conserve the Hamming weight mod k for any k. 
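
The lattice-membership question in this example can be explored with a small Python sketch. It takes integer generators in $\mathbb{Z}^2$, with coordinates read as (change in copies of x, change in copies of $\bar{x}$), reduces them by the Euclidean algorithm to a triangular basis, and tests whether a target such as (1,0) is an integer combination. The helper names are hypothetical, and the sketch checks only the integer span: it ignores the additional requirement, suggested by the caption of Figure 1, that the moves stay within the first quadrant.

```python
from math import gcd

def lattice_basis_2d(generators):
    """Reduce generators to (g, h) and (0, d) spanning the same lattice."""
    g, h, d = 0, 0, 0
    for a, b in generators:
        # Euclid on first coordinates: replace (u, v) by (v, u - q*v).
        while a != 0:
            q = g // a
            g, h, a, b = a, b, g - q * a, h - q * b
        # Now the new vector is (0, b); fold it into the second basis vector.
        d = gcd(d, b)
    return (g, h), (0, d)

def in_lattice(target, generators):
    """True when target is an integer combination of the generators."""
    (g, h), (_, d) = lattice_basis_2d(generators)
    t1, t2 = target
    if g == 0:
        if t1 != 0:
            return False
        rest = t2
    else:
        if t1 % g != 0:
            return False
        rest = t2 - (t1 // g) * h
    return rest == 0 if d == 0 else rest % d == 0

# The two behaviors from the example: (+5, -7) and (-6, 0).
example = [(5, -7), (-6, 0)]
# A hypothetical third behavior, added only for illustration.
extended = example + [(2, -3)]
```

With only the two behaviors from the example, (1,0) is not in the lattice (the lattice has index 42 in $\mathbb{Z}^2$), whereas adding the hypothetical behavior (2,-3) makes (1,0) reachable as an integer combination.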

To generate the Fredkin gate, we instead use the Chinese Remainder Theorem to combine gates 
that change the inner product mod p for various primes p into a gate that changes the inner product 
between two inputs by exactly 1; while to generate the CNOTNOT gate, we exploit the assumption 
that our generating gates are linear. In all these cases, it is crucial that we know, from Section [?],
that certain quantities cannot be conserved by any reversible gate. 

There are a few parts of the classification proof (for example, Section 9.4 on affine gate sets)
that basically do come down to enumerating cases, but we hope to have given a sense for the 
interesting parts. 

1.6 Related Work 

Surprisingly, the general question of classifying reversible gates such as Toffoli and Fredkin appears 
never to have been asked, let alone answered, prior to this work. 

In the reversible computing literature, there are hundreds of papers on synthesizing reversible 
circuits (see [23] for a survey), but most of them focus on practical considerations: for example,
trying to minimize the number of Toffoli gates or other measures of interest, often using software 
optimization tools. We found only a tiny amount of work relevant to the classification problem: 
notably, an unpublished preprint by Lloyd [?], which shows that every non-affine reversible gate
is computationally universal, if one does not care what garbage is generated in addition to the
desired output. Lloyd’s result was subsequently rediscovered by Kerntopf et al. [13] and De Vos
and Storme [29]. We will reprove this result for completeness in Section 8.2, as we use it as one
ingredient in our proof. 

There is also work by Morita et al. [21] that uses brute-force enumeration to classify certain 
reversible computing elements with 2, 3, or 4 wires, but the notion of “reversible gate” there is very 
different from the standard one (the gates are for routing a single “billiard ball” element rather than 
for transforming bit strings, and they have internal state). Finally, there is work by Strazdins [27],
not motivated by reversible computing, which considers classifying reversible Boolean functions, 
but which imposes a separate requirement on each output bit that it belong to one of the classes 
from Post’s original lattice, and which thereby misses all the reversible gates that conserve “global” 
quantities, such as the Fredkin gate.^6

^6 Because of different rules regarding constants, developed with Post’s lattice rather than reversible computing in
mind, Strazdins also includes classes that we do not (e.g., functions that always map $0^n$ or $1^n$ to themselves, but
are otherwise arbitrary). To use our notation, his 13-class lattice ends up intersecting our infinite lattice in just five
classes: $\langle \emptyset \rangle$, $\langle \mathrm{NOT} \rangle$, $\langle \mathrm{CNOTNOT}, \mathrm{NOT} \rangle$, $\langle \mathrm{CNOT} \rangle$, and $\langle \mathrm{Toffoli} \rangle$.
2 Notation and Definitions


$\mathbb{F}_2$ means the field of 2 elements, and $[n]$ means $\{1, \ldots, n\}$. We denote by $e_1, \ldots, e_n$ the standard
basis for the vector space $\mathbb{F}_2^n$: that is, $e_1 = (1, 0, \ldots, 0)$, etc.

Let $x = x_1 \ldots x_n$ be an n-bit string. Then $\bar{x}$ means x with all n of its bits inverted. Also, $x \oplus y$
means bitwise XOR, $x, y$ or $xy$ means concatenation, $x^k$ means the concatenation of k copies of x,
and $|x|$ means the Hamming weight. The parity of x is $|x| \bmod 2$. The inner product of x and y
is the integer $x \cdot y = x_1 y_1 + \cdots + x_n y_n$. Note that

$$x \cdot (y \oplus z) \equiv x \cdot y + x \cdot z \pmod{2},$$

but the above need not hold if we are not working mod 2.

By $\mathrm{gar}(x)$, we mean garbage depending on $x$: that is, “scratch work” that a reversible computation
generates along the way to computing some desired function $f(x)$. Typically, the garbage
later needs to be uncomputed. Uncomputing, a term introduced by Bennett [7], simply means
running an entire computation in reverse, after the output $f(x)$ has been safely stored.

2.1 Gates 

By a (reversible) gate, throughout this paper we will mean a reversible transformation $G$ on the
set of $k$-bit strings: that is, a permutation of $\{0,1\}^k$, for some fixed $k$. Formally, the terms
‘gate’ and ‘reversible transformation’ will mean the same thing; ‘gate’ just connotes a reversible 
transformation that is particularly small or simple. 

A gate is nontrivial if it does something other than permute its input bits, and non-degenerate 
if it does something other than permute its input bits and/or apply NOT’s to some subset of them. 

A gate $G$ is conservative if it satisfies $|G(x)| = |x|$ for all $x$. A gate is mod-$k$-respecting if there
exists a $j$ such that

\[ |G(x)| \equiv |x| + j \pmod{k} \]

for all $x$. It’s mod-$k$-preserving if moreover $j = 0$. It’s mod-preserving if it’s mod-$k$-preserving for
some $k \ge 2$, and mod-respecting if it’s mod-$k$-respecting for some $k \ge 2$.

As special cases, a mod-2-respecting gate is also called parity-respecting, a mod-2-preserving
gate is called parity-preserving, and a gate $G$ such that

\[ |G(x)| \not\equiv |x| \pmod{2} \]

for all $x$ is called parity-flipping. In Theorem 12 we will prove that parity-flipping gates are the
only examples of mod-respecting gates that are not mod-preserving.

The respecting number of a gate $G$, denoted $k(G)$, is the largest $k$ such that $G$ is mod-$k$-respecting.
(By convention, if $G$ is conservative then $k(G) = \infty$, while if $G$ is non-mod-respecting
then $k(G) = 1$.) We have the following fact:

Proposition 1 $G$ is mod-$\ell$-respecting if and only if $\ell$ divides $k(G)$.

Proof. If $\ell$ divides $k(G)$, then certainly $G$ is mod-$\ell$-respecting. Now, suppose $G$ is mod-$\ell$-respecting
but $\ell$ does not divide $k(G)$. Then $G$ is both mod-$\ell$-respecting and mod-$k(G)$-respecting.
So by the Chinese Remainder Theorem, $G$ is mod-$\mathrm{lcm}(\ell, k(G))$-respecting. But since $\ell$ does not
divide $k(G)$, we have $\mathrm{lcm}(\ell, k(G)) > k(G)$, which contradicts the definition of $k(G)$. ■
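The definitions above can be made concrete with a small brute-force sketch. It is purely illustrative and not taken from the paper: gates are modeled as Python functions on bit tuples, and the helper names (`weight_shifts`, `respecting_number`, `is_mod_respecting`) are assumptions of this sketch. It computes $k(G)$ as the gcd of the differences between the observed weight shifts $|G(x)| - |x|$, which assumes that `gate` really is a permutation.

```python
from itertools import product
from math import gcd, inf

def weight_shifts(gate, k):
    """Set of values |G(x)| - |x| over all k-bit inputs x."""
    return {sum(gate(x)) - sum(x) for x in product((0, 1), repeat=k)}

def respecting_number(gate, k):
    """Largest m such that |G(x)| = |x| + j (mod m) for one fixed j.

    Returns inf for conservative gates and 1 for gates that are not
    mod-respecting, matching the conventions in the text.
    """
    shifts = sorted(weight_shifts(gate, k))
    g = 0
    for s in shifts[1:]:
        g = gcd(g, s - shifts[0])   # gcd of all differences between shifts
    return inf if g == 0 else g

def is_mod_respecting(gate, k, ell):
    """True if all weight shifts agree modulo ell."""
    return len({s % ell for s in weight_shifts(gate, k)}) == 1

# Example gates (hypothetical helper names), as functions on bit tuples.
def not_gate(x):
    return (1 - x[0],)

def cnot(x):
    return (x[0], x[0] ^ x[1])

def fredkin(x):
    return (x[0], x[2], x[1]) if x[0] else x

def c3(x):
    return {(0, 0, 0): (1, 1, 1), (1, 1, 1): (0, 0, 0)}.get(x, x)
```

For instance, `respecting_number(c3, 3)` is 3, and `is_mod_respecting(c3, 3, ell)` holds exactly for the divisors of 3, as Proposition 1 predicts.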




A gate $G$ is affine if it implements an affine transformation over $\mathbb{F}_2$: that is, if there exists an
invertible matrix $A \in \mathbb{F}_2^{k \times k}$ and a vector $b \in \mathbb{F}_2^k$, such that $G(x) = Ax \oplus b$ for all $x$. A gate is
linear if moreover $b = 0$. A gate is orthogonal if it satisfies

\[ G(x) \cdot G(y) \equiv x \cdot y \pmod{2} \]

for all $x, y$. (We will observe, in Lemma 14, that every orthogonal gate is linear.) Also, if
$G(x) = Ax \oplus b$ is affine, then the linear part of $G$ is the linear transformation $G'(x) = Ax$. We
call $G$ orthogonal in its linear part, mod-$k$-preserving in its linear part, etc. if $G'$ satisfies the
corresponding invariant. A gate that is orthogonal in its linear part is also called an isometry.

Given two gates $G$ and $H$, their tensor product, $G \otimes H$, is a gate that applies $G$ and $H$ to
disjoint sets of bits. We will often use the tensor product to produce a single gate that combines
the properties of two previous gates. Also, we denote by $G^{\otimes t}$ the tensor product of $t$ copies of $G$.

2.2 Gate Classes 

Let $S = \{G_1, G_2, \ldots\}$ be a set of gates, possibly on different numbers of bits and possibly infinite.
Then $\langle S \rangle = \langle G_1, G_2, \ldots \rangle$, the class of reversible transformations generated by $S$, can be defined
as the smallest set of reversible transformations $F : \{0,1\}^n \to \{0,1\}^n$ that satisfies the following
closure properties:

(1) Base case. $\langle S \rangle$ contains $S$, as well as the identity function $F(x_1 \ldots x_n) = x_1 \ldots x_n$ for all
$n \ge 1$.

(2) Composition rule. If $\langle S \rangle$ contains $F(x_1 \ldots x_n)$ and $G(x_1 \ldots x_n)$, then $\langle S \rangle$ also contains
$F(G(x_1 \ldots x_n))$.

(3) Swapping rule. If $\langle S \rangle$ contains $F(x_1 \ldots x_n)$, then $\langle S \rangle$ also contains all possible functions
$\sigma(F(x_{\tau(1)} \ldots x_{\tau(n)}))$ obtained by permuting $F$’s input and output bits.

(4) Extension rule. If $\langle S \rangle$ contains $F(x_1 \ldots x_n)$, then $\langle S \rangle$ also contains the function

\[ G(x_1 \ldots x_n, b) := (F(x_1 \ldots x_n), b), \]

in which $b$ occurs as a “dummy” bit.

(5) Ancilla rule. If $\langle S \rangle$ contains a function $F$ that satisfies

\[ F(x_1 \ldots x_n, a_1 \ldots a_k) = (G(x_1 \ldots x_n), a_1 \ldots a_k) \quad \forall x_1 \ldots x_n \in \{0,1\}^n, \]

for some smaller function $G$ and fixed “ancilla” string $a_1 \ldots a_k \in \{0,1\}^k$ that do not depend
on $x$, then $\langle S \rangle$ also contains $G$. (Note that, if the $a_i$’s are set to other values, then $F$ need
not have the above form.)

Note that because of reversibility, the set of $n$-bit transformations in $\langle S \rangle$ (for any $n$) always forms
a group. Indeed, if $\langle S \rangle$ contains $F$, then clearly $\langle S \rangle$ contains all the iterates $F^2(x) = F(F(x))$,
etc. But since there must be some positive integer $m$ such that $F^m(x) = x$, this means that
$F^{m-1}(x) = F^{-1}(x)$. Thus, we do not need a separate rule stating that $\langle S \rangle$ is closed under
inverses.
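The argument that inverses come for free can be checked on a small example. The following sketch is only an illustration under our own modeling assumptions (gates as functions on bit tuples; `order`, `inverse_by_iteration`, and the sample gate `g` are hypothetical names): it finds the order $m$ of a gate by brute force and then realizes $F^{-1}$ as $F^{m-1}$.

```python
from itertools import product

def order(gate, k):
    """Smallest m >= 1 such that applying gate m times is the identity."""
    inputs = list(product((0, 1), repeat=k))
    m, current = 1, {x: gate(x) for x in inputs}
    while any(current[x] != x for x in inputs):
        current = {x: gate(current[x]) for x in inputs}
        m += 1
    return m

def inverse_by_iteration(gate, k):
    """Return F^{-1}, built only by composing F with itself m - 1 times."""
    m = order(gate, k)
    def inv(x):
        for _ in range(m - 1):
            x = gate(x)
        return x
    return inv

# Sample 2-bit linear gate of order 3: (x, y) -> (y, x XOR y).
def g(x):
    return (x[1], x[0] ^ x[1])
```

Here `order(g, 2)` is 3, so `inverse_by_iteration(g, 2)` applies `g` twice.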




We say $S$ generates the reversible transformation $F$ if $F \in \langle S \rangle$. We also say that $S$ generates
$\langle S \rangle$. If $\langle S \rangle$ equals the set of all permutations of $\{0,1\}^n$, for all $n \ge 1$, then we call $S$ universal.

Given an arbitrary set $C$ of reversible transformations, we call $C$ a reversible gate class (or class
for short) if $C$ is closed under rules (2)-(5) above: in other words, if there exists an $S$ such that
$C = \langle S \rangle$.

A reversible circuit for the function $F$, over the gate set $S$, is an explicit procedure for generating
$F$ by applying gates in $S$, and thereby showing that $F \in \langle S \rangle$. An example is shown in Figure 2.
Reversible circuit diagrams are read from left to right, with each bit that occurs in the circuit (both 
input and ancilla bits) represented by a horizontal line, and each gate represented by a vertical line. 

If every gate $G \in S$ satisfies some invariant, then we can also describe $S$ and $\langle S \rangle$ as satisfying
that invariant. So for example, the set $\{\mathrm{CNOTNOT}, \mathrm{NOT}\}$ is affine and parity-respecting, and so
is the class that it generates. Conversely, $S$ violates an invariant if any $G \in S$ violates it.

Just as we defined the respecting number $k(G)$ of a gate, we would like to define the respecting
number $k(S)$ of an entire gate set. To do so, we need a proposition about the behavior of $k(G)$
under tensor products.
Figure 2: Generating a Controlled-Controlled-Swap gate from Fredkin


Proposition 2 For all gates $G$ and $H$,

\[ k(G \otimes H) = \gcd(k(G), k(H)). \]

Proof. Letting $\gamma = \gcd(k(G), k(H))$, clearly $G \otimes H$ is mod-$\gamma$-respecting. To see that $G \otimes H$
is not mod-$\ell$-respecting for any $\ell > \gamma$: by definition, $\ell$ must fail to divide either $k(G)$ or $k(H)$.
Suppose it fails to divide $k(G)$ without loss of generality. Then $G$ cannot be mod-$\ell$-respecting, by
Proposition 1. But if we consider pairs of inputs to $G \otimes H$ that differ only on $G$’s input, then this
implies that $G \otimes H$ is not mod-$\ell$-respecting either. ■

If $S = \{G_1, G_2, \ldots\}$, then because of Proposition 2, we can define $k(S)$ as $\gcd(k(G_1), k(G_2), \ldots)$.
For then not only will every transformation in $\langle S \rangle$ be mod-$k(S)$-respecting, but there will exist
transformations in $\langle S \rangle$ that are not mod-$\ell$-respecting for any $\ell > k(S)$.

We then have that $S$ is mod-$k$-respecting if and only if $k$ divides $k(S)$, and mod-respecting if
and only if $S$ is mod-$k$-respecting for some $k \ge 2$.
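Proposition 2 is easy to sanity-check numerically on small gates. The sketch below is an illustration under our own assumptions rather than part of the paper: gates are functions on bit tuples, `tensor` applies two gates to disjoint blocks of bits, and `gcd_with_inf` encodes the convention that a conservative gate has respecting number $\infty$. All names are hypothetical.

```python
from itertools import product
from math import gcd, inf

def respecting_number(gate, k):
    """Brute-force k(G): gcd of differences between weight shifts."""
    shifts = sorted({sum(gate(x)) - sum(x) for x in product((0, 1), repeat=k)})
    g = 0
    for s in shifts[1:]:
        g = gcd(g, s - shifts[0])
    return inf if g == 0 else g

def tensor(g, kg, h):
    """G tensor H: apply g to the first kg bits and h to the remaining bits."""
    return lambda x: tuple(g(x[:kg])) + tuple(h(x[kg:]))

def c_gate(k):
    """C_k: exchange 0^k and 1^k, and fix every other string."""
    zeros, ones = (0,) * k, (1,) * k
    return lambda x: ones if x == zeros else zeros if x == ones else x

def fredkin(x):
    return (x[0], x[2], x[1]) if x[0] else x

def gcd_with_inf(a, b):
    """gcd, treating inf as divisible by every integer."""
    if a == inf:
        return b
    if b == inf:
        return a
    return gcd(a, b)
```

For example, the 10-bit gate `tensor(c_gate(4), 4, c_gate(6))` has respecting number 2, which equals `gcd(4, 6)`.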

2.3 Alternative Kinds of Generation 

We now discuss four alternative notions of what it can mean for a reversible gate set to “generate” 
a transformation. Besides being interesting in their own right, some of these notions will also be 
used in the proof of our main classification theorem. 

Partial Gates. A partial reversible gate is an injective function $H : D \to \{0,1\}^n$, where $D$
is some subset of $\{0,1\}^n$. Such an $H$ is consistent with a full reversible gate $G$ if $G(x) = H(x)$
whenever $x \in D$. Also, we say that a reversible gate set $S$ generates $H$ if $S$ generates any $G$ with
which $H$ is consistent. As an example, COPY is the 2-bit partial reversible gate defined by the
following relations:

\[ \mathrm{COPY}(00) = 00, \qquad \mathrm{COPY}(10) = 11. \]

If a gate set $S$ can implement the above behavior, using ancilla bits that are returned to their
original states by the end, then we say $S$ “generates COPY”; the behavior on inputs 01 and 11 is
irrelevant. Note that COPY is consistent with CNOT. One can think of COPY as a bargain-basement
CNOT, but one that might be bootstrapped up to a full CNOT with further effort.
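As a small illustration of consistency between a partial gate and a full gate, the following sketch stores COPY as a dictionary on its domain $D = \{00, 10\}$ and checks it against CNOT. The representation and the helper names `is_injective` and `is_consistent` are assumptions made for this example.

```python
def cnot(x):
    a, b = x
    return (a, a ^ b)

def identity(x):
    return x

# The partial gate COPY, specified only on its domain D = {00, 10}.
COPY = {(0, 0): (0, 0), (1, 0): (1, 1)}

def is_injective(partial):
    """A partial reversible gate must not map two inputs to the same output."""
    return len(set(partial.values())) == len(partial)

def is_consistent(partial, gate):
    """True if the full gate agrees with the partial gate everywhere on D."""
    return all(gate(x) == y for x, y in partial.items())
```

With these definitions, `is_consistent(COPY, cnot)` holds, while the identity gate is not consistent with COPY.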

Generation With Garbage. Let $D \subseteq \{0,1\}^m$, and $H : D \to \{0,1\}^n$ be some function, which
need not be injective or surjective, or even have the same number of input and output bits. Then we
say that a reversible gate set $S$ generates $H$ with garbage if there exists a reversible transformation
$G \in \langle S \rangle$, as well as an ancilla string $a$ and a function $\mathrm{gar}$, such that $G(x, a) = (H(x), \mathrm{gar}(x))$ for
all $x \in D$. As an example, consider the ordinary 2-bit AND function, from $\{0,1\}^2$ to $\{0,1\}$. Since
AND destroys information, clearly no reversible gate can generate it in the usual sense, but many
reversible gates can generate AND with garbage: for instance, the Toffoli and Fredkin gates, as we
saw in Section 
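To make the AND example concrete, here is a minimal sketch, under the assumption that gates are modeled as functions on bit tuples and that the single ancilla is initialized to 0. The function name `and_with_garbage` is hypothetical. In both cases the AND of the two inputs appears on the last wire, and the first two wires carry the garbage.

```python
from itertools import product

def toffoli(x):
    a, b, c = x
    return (a, b, c ^ (a & b))

def fredkin(x):
    a, b, c = x
    return (a, c, b) if a else (a, b, c)

def and_with_garbage(gate, a, b):
    """Feed (a, b) plus one ancilla set to 0; return (AND output, garbage)."""
    out = gate((a, b, 0))
    return out[2], out[:2]
```

For Toffoli the garbage is simply a copy of the inputs; for Fredkin it is $(a, \bar{a} \wedge b)$.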

Encoded Universality. This is a concept borrowed from quantum computing [1]. In our
setting, encoded universality means that there is some way of encoding 0’s and 1’s by longer
strings, such that our gate set can implement any desired transformation on the encoded bits.
Note that, while this is a weaker notion of universality than the ability to generate arbitrary
permutations of $\{0,1\}^n$, it is stronger than “merely” computational universality, because it still
requires a transformation to be performed reversibly, with no garbage left around. Formally, given
a reversible gate set $S$, we say that $S$ supports encoded universality if there are $k$-bit strings $\alpha(0)$
and $\alpha(1)$ such that for every $n$-bit reversible transformation $F(x_1 \ldots x_n) = y_1 \ldots y_n$, there exists
a transformation $G \in \langle S \rangle$ that satisfies

\[ G(\alpha(x_1) \ldots \alpha(x_n)) = \alpha(y_1) \ldots \alpha(y_n) \]

for all $x \in \{0,1\}^n$. Also, we say that $S$ supports affine encoded universality if this is true for every
affine $F$.

As a well-known example, the Fredkin gate is not universal in the usual sense, because it 
preserves Hamming weight. But it is easy to see that Fredkin supports encoded universality, 
using the so-called dual-rail encoding, in which every 0 bit is encoded as 01, and every 1 bit is 
encoded as 10. In Section 4.4 we will show, as a consequence of our classification theorem, that
every reversible gate set (except for degenerate sets) supports either encoded universality or affine 
encoded universality. 

Loose Generation. Finally, we say that a gate set $S$ loosely generates a reversible transformation
$F : \{0,1\}^n \to \{0,1\}^n$, if there exists a transformation $G \in \langle S \rangle$, as well as ancilla strings $a$
and $b$, such that

\[ G(x, a) = (F(x), b) \]

for all $x \in \{0,1\}^n$. In other words, $G$ is allowed to change the ancilla bits, so long as they change
in a way that is independent of the input $x$. Under this rule, one could perhaps tell by examining
the ancilla bits that $G$ was applied, but one could not tell to which input. This suffices for some
applications of reversible computing, though not for others.⁷

⁷ For example, if $G$ were applied to a quantum superposition, then it would still maintain coherence among all the
inputs to which it was applied, though perhaps not between those inputs and other inputs in the superposition to
which it was not applied.
3 Stating the Classification Theorem


In this section we state our main result, and make a few preliminary remarks about it. First let 
us define the gates that appear in the classification theorem. 

• NOT is the 1-bit gate that maps $x$ to $\bar{x}$.

• NOTNOT, or $\mathrm{NOT}^{\otimes 2}$, is the 2-bit gate that maps $xy$ to $\bar{x}\bar{y}$. NOTNOT is a parity-preserving
variant of NOT.

• CNOT (Controlled-NOT) is the 2-bit gate that maps $x, y$ to $x, y \oplus x$. CNOT is affine.

• CNOTNOT is the 3-bit gate that maps $x, y, z$ to $x, y \oplus x, z \oplus x$. CNOTNOT is affine and
parity-preserving.

• Toffoli (also called Controlled-Controlled-NOT, or CCNOT) is the 3-bit gate that maps $x, y, z$
to $x, y, z \oplus xy$.

• Fredkin (also called Controlled-SWAP, or CSWAP) is the 3-bit gate that maps $x, y, z$ to
$x, y \oplus x(y \oplus z), z \oplus x(y \oplus z)$. In other words, it swaps $y$ with $z$ if $x = 1$, and does nothing
if $x = 0$. Fredkin is conservative: it never changes the Hamming weight.

• $C_k$ is a $k$-bit gate that maps $0^k$ to $1^k$ and $1^k$ to $0^k$, and all other $k$-bit strings to themselves.
$C_k$ preserves the Hamming weight mod $k$. Note that $C_1 = \mathrm{NOT}$, while $C_2$ is equivalent to
NOTNOT, up to a bit-swap.

• $T_k$ is a $k$-bit gate (for even $k$) that maps $x$ to $\bar{x}$ if $|x|$ is odd, or to $x$ if $|x|$ is even. A different
definition is

\[ T_k(x_1 \ldots x_k) = (x_1 \oplus b_x, \ldots, x_k \oplus b_x), \]

where $b_x := x_1 \oplus \cdots \oplus x_k$. This shows that $T_k$ is linear. Indeed, we also have

\[ T_k(x) \cdot T_k(y) \equiv x \cdot y + (k + 2)\, b_x b_y \equiv x \cdot y \pmod{2}, \]

which shows that $T_k$ is orthogonal. Note also that, if $k \equiv 2 \pmod{4}$, then $T_k$ preserves
Hamming weight mod 4: if $|x|$ is even then $|T_k(x)| = |x|$, while if $|x|$ is odd then

\[ |T_k(x)| = k - |x| \equiv 2 - |x| \equiv |x| \pmod{4}. \]

• $F_k$ is a $k$-bit gate (for even $k$) that maps $x$ to $\bar{x}$ if $|x|$ is even, or to $x$ if $|x|$ is odd. A different
definition is

\[ F_k(x_1 \ldots x_k) = \overline{T_k(x_1 \ldots x_k)} = (x_1 \oplus b_x \oplus 1, \ldots, x_k \oplus b_x \oplus 1), \]

where $b_x$ is as above. This shows that $F_k$ is affine. Indeed, if $k$ is a multiple of 4, then $F_k$
preserves Hamming weight mod 4: if $|x|$ is odd then $|F_k(x)| = |x|$, while if $|x|$ is even then

\[ |F_k(x)| = k - |x| \equiv |x| \pmod{4}. \]

Since $F_k$ is equal to $T_k$ in its linear part, $F_k$ is also an isometry.
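The invariants claimed for $C_k$, $T_k$, and $F_k$ in the list above can be verified by brute force for small $k$. The following sketch is an illustration only, with gates modeled as functions on bit tuples (the gate width is taken from the length of the input) and with helper names of our own choosing.

```python
from itertools import product

def bits(k):
    return list(product((0, 1), repeat=k))

def flip(x):
    return tuple(1 - b for b in x)

def t_gate(x):
    """T_k: complement x when |x| is odd (k = len(x), assumed even)."""
    return flip(x) if sum(x) % 2 else x

def f_gate(x):
    """F_k: complement x when |x| is even (k = len(x), assumed even)."""
    return x if sum(x) % 2 else flip(x)

def c_gate(x):
    """C_k: exchange 0^k and 1^k, and fix every other string."""
    k = len(x)
    if sum(x) == 0:
        return (1,) * k
    if sum(x) == k:
        return (0,) * k
    return x

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def preserves_weight_mod(gate, k, m):
    return all((sum(gate(x)) - sum(x)) % m == 0 for x in bits(k))

def is_orthogonal(gate, k):
    return all(dot(gate(x), gate(y)) % 2 == dot(x, y) % 2
               for x in bits(k) for y in bits(k))
```

Running the checks for $k = 4$ and $k = 6$ confirms, for example, that $T_6$ preserves Hamming weight mod 4 while $T_4$ does not, and that $F_4$ preserves weight mod 4 but is not itself orthogonal (it is only orthogonal in its linear part).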
We can now state the classification theorem.


Theorem 3 (Main Result) Every set of reversible gates generates one of the following classes:


1. The trivial class (which contains only bit-swaps).

2. The class of all transformations (generated by Toffoli).

3. The class of all conservative transformations (generated by Fredkin).

4. For each $k \ge 3$, the class of all mod-$k$-preserving transformations (generated by $C_k$).

5. The class of all affine transformations (generated by CNOT).

6. The class of all parity-preserving affine transformations (generated by CNOTNOT).

7. The class of all mod-4-preserving affine transformations (generated by $F_4$).

8. The class of all orthogonal linear transformations (generated by $T_4$).

9. The class of all mod-4-preserving orthogonal linear transformations (generated by $T_6$).

10. Classes 1, 3, 7, 8, or 9 augmented by a NOTNOT gate (note: 7 and 8 become equivalent this
way).

11. Classes 1, 3, 6, 7, 8, or 9 augmented by a NOT gate (note: 7 and 8 become equivalent this
way).


Furthermore, all the above classes are distinct except when noted otherwise, and they fit together
in the lattice diagram shown in Figure 3.

Let us make some comments about the structure of the lattice. The lattice has a countably
infinite number of classes, with the one infinite part given by the mod-$k$-preserving classes. The
mod-$k$-preserving classes are partially ordered by divisibility, which means, for example, that the
lattice is not planar.⁹ While there are infinite descending chains in the lattice, there is no infinite
ascending chain. This means that, if we start from some reversible gate class and then add new
gates that extend its power, we must terminate after finitely many steps with the class of all
reversible transformations.

In Appendix 15 we will prove that if we allow loose generation, then the only change to Theorem
3 is that every $C$ + NOTNOT class collapses with the corresponding $C$ + NOT class.

⁸ Let us mention that Fredkin + NOTNOT generates the class of all parity-preserving transformations, while
Fredkin + NOT generates the class of all parity-respecting transformations. We could have listed the parity-preserving
transformations as a special case of the mod-$k$-preserving transformations: namely, the case $k = 2$. If we had done
so, though, we would have had to include the caveat that $C_k$ only generates all mod-$k$-preserving transformations
when $k \ge 3$ (when $k = 2$, we also need Fredkin in the generating set). And in any case, the parity-respecting class
would still need to be listed separately.

⁹ For consider the graph with the integers 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 14, 15, 18, 20, 21, 24, and 28 as its vertices,
and with an edge between each pair whose ratio is a prime. One can check that this graph contains $K_{3,3}$ as a minor.
Figure 3: The inclusion lattice of reversible gate classes

4 Consequences of the Classification


To illustrate the power of the classification theorem, in this section we use it to prove four general 
implications for reversible computation. While these implications are easy to prove with the 
classification in hand, we do not know how to prove any of them without it. 

4.1 Nature of the Classes 

Here is one immediate (though already non-obvious) corollary of Theorem 3.

Corollary 4 Every reversible gate class $C$ is finitely generated: that is, there exists a finite set $S$
such that $C = \langle S \rangle$.

Indeed, we have something stronger.

Corollary 5 Every reversible gate class $C$ is generated by a single gate $G \in C$.

Proof. This is immediate for all the classes listed in Theorem 3, except the ones involving NOT
or NOTNOT gates. For classes of the form $C = \langle G, \mathrm{NOT} \rangle$ or $C = \langle G, \mathrm{NOTNOT} \rangle$, we just need a
single gate $G'$ that is clearly generated by $C$, and clearly not generated by a smaller class. We can
then appeal to Theorem 3 to assert that $G'$ must generate $C$. For each of the relevant $G$’s (namely,
Fredkin, CNOTNOT, $F_4$, and $T_6$), one such $G'$ is the tensor product, $G \otimes \mathrm{NOT}$ or $G \otimes \mathrm{NOTNOT}$.
■

We also wish to point out a non-obvious symmetry property that follows from the classification
theorem. Given an $n$-bit reversible transformation $F$, let $F^*$, or the dual of $F$, be $F^*(x_1 \ldots x_n) :=
\overline{F(\bar{x}_1 \ldots \bar{x}_n)}$. The dual can be thought of as $F$ with the roles of 0 and 1 interchanged: for example,
$\mathrm{Toffoli}^*(xyz)$ flips $z$ if and only if $x = y = 0$. Also, call a gate $F$ self-dual if $F^* = F$, and call a
reversible gate class $C$ dual-closed if $F^* \in C$ whenever $F \in C$. Then:

Corollary 6 Every reversible gate class $C$ is dual-closed.

Proof. This is obvious for all the classes listed in Theorem 3 that include a NOT or NOTNOT gate.
For the others, we simply need to consider the classes one by one: the notions of “conservative,”
“mod-$k$-respecting,” and “mod-$k$-preserving” are manifestly the same after we interchange 0 and 1.
This is less manifest for the notion of “orthogonal,” but one can check that $T_k$ and $F_k$ are self-dual
for all even $k$. ■
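The self-duality of $T_k$ and $F_k$ used in this proof can be checked mechanically for small even $k$. The sketch below is illustrative only: gates are functions on bit tuples, and `dual` and `is_self_dual` are hypothetical helper names implementing $F^*(x) = \overline{F(\bar{x})}$.

```python
from itertools import product

def flip(x):
    return tuple(1 - b for b in x)

def dual(gate):
    """F*(x): complement the input, apply F, complement the output."""
    return lambda x: flip(gate(flip(x)))

def is_self_dual(gate, k):
    d = dual(gate)
    return all(d(x) == gate(x) for x in product((0, 1), repeat=k))

def t_gate(x):
    return flip(x) if sum(x) % 2 else x

def f_gate(x):
    return x if sum(x) % 2 else flip(x)

def toffoli(x):
    a, b, c = x
    return (a, b, c ^ (a & b))
```

As the text notes, the dual of Toffoli flips its last bit exactly when the first two bits are both 0, so Toffoli is not self-dual, whereas `t_gate` and `f_gate` pass the check for $k = 4$ and $k = 6$.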

4.2 Linear-Time Algorithm 

If one wanted, one could interpret this entire paper as addressing a straightforward algorithms
problem: namely, the RevGen problem defined in Section 1.4, where we are given as input a set of
reversible gates $G_1, \ldots, G_k$, as well as a target reversible transformation $H$, and we want to know
whether the $G_i$’s generate $H$. From that perspective, our contribution is to reduce the known
upper bound on the complexity of RevGen: from recursively-enumerable (!), or triply-exponential
time if we use Jeřábek’s recent clone/coclone duality for reversible gates [12], all the way down to
linear time.

Theorem 7 There is a linear-time algorithm for RevGen. 
Proof. It suffices to give a linear-time algorithm that takes as input the truth table of a single
reversible transformation $G : \{0,1\}^n \to \{0,1\}^n$, and that decides which class it generates. For we
can then compute $\langle G_1, \ldots, G_k \rangle$ by taking the least upper bound of $\langle G_1 \rangle, \ldots, \langle G_k \rangle$, and can also
solve the membership problem by checking whether

\[ \langle G_1, \ldots, G_k \rangle = \langle G_1, \ldots, G_k, H \rangle. \]

The algorithm is as follows: first, make a single pass through $G$’s truth table, in order to answer
the following two questions.

• Is $G$ affine, and if so, what is its matrix representation, $G(x) = Ax \oplus b$?

• What is $W(G) := \{|G(x)| - |x| : x \in \{0,1\}^n\}$?

In any reasonable RAM model, both questions can easily be answered in $O(n 2^n)$ time, which
is the number of bits in $G$’s truth table.

If $G$ is non-affine, then Theorem 3 implies that we can determine $\langle G \rangle$ from $W(G)$ alone. If $G$ is
affine, then Theorem 3 implies we can determine $\langle G \rangle$ from $(A, b)$ alone, though it is also convenient
to use $W(G)$. We need to take the gcd of the numbers in $W(G)$, check whether $A$ is orthogonal,
etc., but the time needed for these operations is only $\mathrm{poly}(n)$, which is negligible compared to the
input size of $n 2^n$. ■
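The single pass described in the proof can be sketched as follows. This is our own minimal illustration, not the authors' Java implementation, and it covers only the first pass (the affine test and the set $W(G)$), not the final lookup of the class. It assumes the truth table is given as a list `table` with `table[x]` equal to $G(x)$, where inputs and outputs are $n$-bit integers; the function name `analyze` is hypothetical.

```python
def analyze(table, n):
    """One pass over a truth table; returns (is_affine, columns, b, W).

    columns[i] is the image A e_i of the i-th basis vector (as an int),
    b is the affine offset G(0), and W is the set of weight shifts.
    """
    b = table[0]
    columns = [table[1 << i] ^ b for i in range(n)]
    linear = [0] * (1 << n)          # linear[x] = A x, filled in incrementally
    is_affine = True
    shifts = set()
    for x in range(1 << n):
        if x:
            low = x & -x             # lowest set bit of x
            linear[x] = linear[x ^ low] ^ columns[low.bit_length() - 1]
        if linear[x] ^ b != table[x]:
            is_affine = False        # G disagrees with its affine candidate
        shifts.add(bin(table[x]).count("1") - bin(x).count("1"))
    return is_affine, columns, b, shifts

# Small truth tables: bit 0 is the target; higher bits are controls.
NOT_TABLE = [1, 0]
CNOT_TABLE = [0, 1, 3, 2]
TOFFOLI_TABLE = [0, 1, 2, 3, 4, 5, 7, 6]
```

Each table entry is touched once, and the candidate linear map is extended from $x$ with its lowest set bit cleared, so the pass uses a number of word operations proportional to the table length.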

We have implemented the algorithm described in Theorem 7, and Java code is available for
download [ 2 i] . 

4.3 Compression of Reversible Circuits 

We now state a “complexity-theoretic” consequence of Theorem 3.

Theorem 8 Let $R$ be a reversible circuit, over any gate set $S$, that maps $\{0,1\}^n$ to $\{0,1\}^n$, using
an unlimited number of gates and ancilla bits. Then there is another reversible circuit, over the
same gate set $S$, that applies the same transformation as $R$ does, and that uses only $2^n \mathrm{poly}(n)$
gates and $O(1)$ ancilla bits.¹⁰

Proof. If $S$ is one of the gate sets listed in Theorem 3, then this follows immediately by examining
the reversible circuit constructions in Section 7, for each class in the classification. Building, in
relevant parts, on results by others 1251E], we will take care in Section [7] to ensure that each non- 
affine circuit construction uses at most $2^n \mathrm{poly}(n)$ gates and $O(1)$ ancilla bits, while each affine
construction uses at most $O(n^2)$ gates and $O(1)$ ancilla bits (most actually use no ancilla bits).

Now suppose $S$ is not one of the sets listed in Theorem 3, but some other set that generates
one of the listed classes. So for example, suppose $\langle S \rangle = \langle \mathrm{Fredkin}, \mathrm{NOT} \rangle$. Even then, we know
that $S$ generates Fredkin and NOT, and the number of gates and ancillas needed to do so is just
some constant, independent of $n$. Furthermore, each time we need a Fredkin or NOT, we can reuse
the same ancilla bits, by the assumption that those bits are returned to their original states. So
we can simply simulate the appropriate circuit construction from Section 7, using only a constant
factor more gates and $O(1)$ more ancilla bits than the original construction. ■

¹⁰ Here the big-O’s suppress constant factors that depend on the gate set in question.
As we said in Section 1.4, without the classification theorem, it is not obvious how to prove any
upper bound whatsoever on the number of gates or ancillas, for arbitrary gate sets $S$. Of course,
any circuit that uses $T$ gates also uses at most $O(T)$ ancillas; and conversely, any circuit that uses
$M$ ancillas needs at most $(2^{n+M})!$ gates, for counting reasons. But the best upper bounds on
either quantity that follow from clone theory and the ideal membership problem appear to have
the form $\exp(\exp(\exp(\exp(n))))$.

A constant number of ancilla bits is sometimes needed, and not only for the trivial reasons that
our gates might act on more than $n$ bits, or only (e.g.) be able to map $0^n$ to $0^n$ if no ancillas are
available.

Proposition 9 (Toffoli [28]) If no ancillas are allowed, then there exist reversible transformations
of $\{0,1\}^n$ that cannot be generated by any sequence of reversible gates on $n - 1$ bits or fewer.

Proof. For all $k \ge 1$, any $(n - k)$-bit gate induces an even permutation of $\{0,1\}^n$, since each
cycle is repeated $2^k$ times, once for every setting of the $k$ bits on which the gate doesn’t act. But
there are also odd permutations of $\{0,1\}^n$. ■
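The parity argument can be observed directly on small cases. The following sketch is an illustration under our own assumptions (gates as functions on bit tuples; `permutation_parity` and `extend` are hypothetical names): it computes the sign of the permutation that a gate induces on $\{0,1\}^n$ by walking its cycles.

```python
from itertools import product

def permutation_parity(gate, n):
    """Return 0 if gate is an even permutation of {0,1}^n, and 1 if odd."""
    seen, parity = set(), 0
    for x in product((0, 1), repeat=n):
        length = 0
        while x not in seen:           # walk the cycle that contains x
            seen.add(x)
            x = gate(x)
            length += 1
        if length:
            parity ^= (length - 1) & 1  # a cycle of length L is L - 1 swaps
    return parity

def extend(gate, m):
    """Apply gate to the first m bits and leave all remaining bits alone."""
    return lambda x: tuple(gate(x[:m])) + tuple(x[m:])

def toffoli(x):
    a, b, c = x
    return (a, b, c ^ (a & b))

def not_gate(x):
    return (1 - x[0],)
```

On 3 bits, Toffoli is a single transposition and hence odd; once it is extended to act on 4 bits, its one transposition is repeated twice and the permutation becomes even.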

It is also easy to show, using a Shannon counting argument, that there exist $n$-bit reversible
transformations that require $\Omega(2^n)$ gates to implement, and $n$-bit affine transformations that require
$\Omega(n^2 / \log n)$ gates. Thus the bounds in Theorem 8 on the number of gates $T$ are, for each
class, off from the optimal bounds only by polylog $T$ factors.

4.4 Encoded Universality 

If we only care about which Boolean functions $f : \{0,1\}^n \to \{0,1\}$ can be computed, and are
completely uninterested in what garbage is output along with $f$, then it is not hard to see that
all reversible gate sets fall into three classes. Namely, non-affine gate sets (such as Toffoli and
Fredkin) can compute all Boolean functions;¹¹ non-degenerate affine gate sets (such as CNOT
and CNOTNOT) can compute all affine functions; and degenerate gate sets (such as NOT and
NOTNOT) can compute only 1-bit functions. However, the classification theorem lets us make
a more interesting statement. Recall the notion of encoded universality from Section 2.3, which
demands that every reversible transformation (or every affine transformation) be implementable
without garbage, once 0 and 1 are “encoded” by longer strings $\alpha(0)$ and $\alpha(1)$ respectively.

Theorem 10 Besides the trivial, NOT, and NOTNOT classes, every reversible gate class supports 
encoded universality if non-affine, or affine encoded universality if affine. 

Proof. For $\langle \mathrm{Fredkin} \rangle$, and for all the non-affine classes above $\langle \mathrm{Fredkin} \rangle$, we use the so-called “dual-rail
encoding,” where 0 is encoded by 01 and 1 is encoded by 10. Given three encoded bits, $x\bar{x}y\bar{y}z\bar{z}$,
we can simulate a Fredkin gate by applying one Fredkin to $xyz$ and another to $x\bar{y}\bar{z}$, and can also
simulate a CNOT by applying a Fredkin to $xy\bar{y}$. But Fredkin + CNOT generates everything.

The dual-rail encoding also works for simulating all affine transformations using an $F_4$ gate.
For note that

\[ F_4(x y \bar{y} 1) = (1, \overline{x \oplus y}, x \oplus y, x) = (x, x \oplus y, \overline{x \oplus y}, 1), \]

where we used that we can permute bits for free. So given two encoded bits, $x\bar{x}y\bar{y}$, we can simulate
a CNOT from $x$ to $y$ by applying $F_4$ to $x$, $y$, $\bar{y}$, and one ancilla bit initialized to 1.

¹¹ This was proven by Lloyd [19], as well as by Kerntopf et al. [13] and De Vos and Storme [29]; we include a proof
for completeness in Section 8.2.

For $\langle \mathrm{CNOTNOT} \rangle$, we use a repetition encoding, where 0 is encoded by 00 and 1 is encoded by
11. Given two encoded bits, $xxyy$, we can simulate a CNOT from $x$ to $y$ by applying a CNOTNOT
from either copy of $x$ to both copies of $y$. This lets us perform all affine transformations on the
encoded subspace.

The repetition encoding also works for $\langle T_4 \rangle$. For notice that

\[ T_4(x y y 0) = (0, x \oplus y, x \oplus y, x) = (x, x \oplus y, x \oplus y, 0). \]

Thus, to simulate a CNOT from $x$ to $y$, we use one copy of $x$, both copies of $y$, and one ancilla bit
initialized to 0.

Finally, for $\langle T_6 \rangle$, we encode 0 by 0011 and 1 by 1100. Notice that

\[ T_6(x y y \bar{y} \bar{y} 0) = (0, x \oplus y, x \oplus y, \overline{x \oplus y}, \overline{x \oplus y}, x) = (x, x \oplus y, x \oplus y, \overline{x \oplus y}, \overline{x \oplus y}, 0). \]

So given two encoded bits, $xx\bar{x}\bar{x}yy\bar{y}\bar{y}$, we can simulate a CNOT from $x$ to $y$ by using one copy of
$x$, all four copies of $y$ and $\bar{y}$, and one ancilla bit initialized to 0. ■
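The three identities used in this proof for $F_4$, $T_4$, and $T_6$ can be confirmed by exhaustive evaluation. The sketch below is illustrative only: gates are functions on bit tuples, and the helpers `cnot_via_f4`, `cnot_via_t4`, and `cnot_via_t6` are hypothetical names. Each returns the control bit, the re-encoded target, and the ancilla after the gate is applied, with the free bit permutation done by unpacking the output.

```python
from itertools import product

def flip(x):
    return tuple(1 - b for b in x)

def t_gate(x):
    """T_k: complement x when |x| is odd."""
    return flip(x) if sum(x) % 2 else x

def f_gate(x):
    """F_k: complement x when |x| is even."""
    return x if sum(x) % 2 else flip(x)

def cnot_via_f4(x, y):
    """Dual-rail target (y, 1 - y); one ancilla initialized to 1."""
    ancilla, y_bar_new, y_new, x_out = f_gate((x, y, 1 - y, 1))
    return x_out, (y_new, y_bar_new), ancilla

def cnot_via_t4(x, y):
    """Repetition-encoded target (y, y); one ancilla initialized to 0."""
    ancilla, y1, y2, x_out = t_gate((x, y, y, 0))
    return x_out, (y1, y2), ancilla

def cnot_via_t6(x, y):
    """Target encoded as (y, y, 1 - y, 1 - y), i.e. 0 -> 0011 and 1 -> 1100."""
    ancilla, a, b, c, d, x_out = t_gate((x, y, y, 1 - y, 1 - y, 0))
    return x_out, (a, b, c, d), ancilla
```

In all three cases the control bit and the ancilla come back unchanged, and the target block holds the encoding of $x \oplus y$.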

In the proof of Theorem 10, notice that, every time we simulated Fredkin$(xyz)$ or CNOT$(xy)$,
we had to examine only a single bit in the encoding of the control bit $x$. Thus, Theorem 10 actually
yields a stronger consequence: that given an ordinary, unencoded input string $x_1 \ldots x_n$, we can use
any non-degenerate reversible gate first to translate $x$ into its encoded version $\alpha(x_1) \ldots \alpha(x_n)$, and
then to perform arbitrary transformations or affine transformations on the encoding.

5 Structure of the Proof 

The proof of Theorem 3 naturally divides into four components. First, we need to verify that
all the gates mentioned in the theorem really do satisfy the invariants that they are claimed to 
satisfy—and as a consequence, that any reversible transformation they generate also satisfies the 
invariants. This is completely routine. 

Second, we need to verify that all pairs of classes that Theorem 3 says are distinct, are distinct.
We handle this in Theorem 11 below (there are only a few non-obvious cases).

Third, we need to verify that the “gate definition” of each class coincides with its “invariant
definition,” i.e., that each gate really does generate all reversible transformations that satisfy
its associated invariant. For example, we need to show that Fredkin generates all conservative
transformations, that $C_k$ generates all transformations that preserve Hamming weight mod $k$, and
that $T_4$ generates all orthogonal linear transformations. Many of these results are already known,
but for completeness, we prove all of them in Section 7, by giving explicit constructions of reversible
circuits.¹²

¹² The upshot of the Galois connection for clones [11] is that, if we could prove that a list of invariants for a given
gate set $S$ was the complete list of invariants satisfied by $S$, then this second part of the proof would be unnecessary:
it would follow automatically that $S$ generates all reversible transformations that satisfy the invariants. But this
begs the question: how do we prove that a list of invariants for $S$ is complete? In each case, the easiest way we
could find to do this, was just by explicitly describing circuits of $S$-gates to generate all transformations that satisfy
the stated invariants.
Finally, we need to show that there are no additional reversible gate classes, besides the ones
listed in Theorem 3. This is by far the most interesting part, and occupies the majority of the
paper. The organization is as follows:

• In Section 6, we collect numerous results about what reversible transformations can and
cannot do to Hamming weights mod $k$ and inner products mod $k$, in both the affine and the
non-affine cases; these results are then drawn on in the rest of the paper. (Some of them are
even used for the circuit constructions in Section 7.)

• In Section 8, we complete the classification of all non-affine gate sets. In Section 8.1, we show
that the only classes that contain a Fredkin gate are $\langle \mathrm{Fredkin} \rangle$ itself, $\langle \mathrm{Fredkin}, \mathrm{NOTNOT} \rangle$,
$\langle \mathrm{Fredkin}, \mathrm{NOT} \rangle$, $\langle C_k \rangle$ for $k \ge 3$, and $\langle \mathrm{Toffoli} \rangle$. Next, in Section 8.3, we show that every
nontrivial conservative gate generates Fredkin. Then, in Section 8.4, we build on the result
of Section 8.4 to show that every non-affine gate set generates Fredkin.

• In Section 9, we complete the classification of all affine gate sets. For simplicity, we start
with linear gate sets only. In Section 9.1, we show that every nontrivial mod-4-preserving
linear gate generates $T_6$, and that every nontrivial, non-mod-4-preserving orthogonal gate
generates $T_4$. Next, in Section 9.2, we show that every non-orthogonal linear gate generates
CNOTNOT. Then, in Section 9.3, we show that every non-parity-preserving linear gate generates
CNOT. Since CNOT generates all linear transformations, this completes the classification
of linear gate sets. Finally, in Section 9.4, we “put back the affine part,” showing that it can
lead to only 8 additional classes besides the linear classes $\langle \emptyset \rangle$, $\langle T_6 \rangle$, $\langle T_4 \rangle$, $\langle \mathrm{CNOTNOT} \rangle$, and
$\langle \mathrm{CNOT} \rangle$.

Theorem 11 All pairs of classes asserted to be distinct by Theorem 3 are distinct.

Proof. In each case, one just needs to observe that the gate that generates a given class $A$, satisfies
some invariant violated by the gate that generates another class $B$. (Here we are using the “gate
definitions” of the classes, which will be proven equivalent to the invariant definitions in Section
7.) So for example, $\langle \mathrm{Fredkin} \rangle$ cannot contain CNOT because Fredkin is conservative; conversely,
$\langle \mathrm{CNOT} \rangle$ cannot contain Fredkin because CNOT is affine.

The only tricky classes are those involving NOT and NOTNOT gates: indeed, these classes do
sometimes coincide, as noted in Theorem 3. However, in all cases where the classes are distinct,
their distinctness is witnessed by the following invariants:

• $\langle \text{Fredkin}, \text{NOT} \rangle$ and $\langle \text{Fredkin}, \text{NOTNOT} \rangle$ are conservative in their linear part.

• $\langle \text{CNOTNOT}, \text{NOT} \rangle$ is parity-preserving in its linear part.

• $\langle F_4, \text{NOT} \rangle = \langle T_4, \text{NOT} \rangle$ and $\langle F_4, \text{NOTNOT} \rangle = \langle T_4, \text{NOTNOT} \rangle$ are orthogonal in their linear
part (isometries).

• $\langle T_6, \text{NOT} \rangle$ and $\langle T_6, \text{NOTNOT} \rangle$ are orthogonal and mod-4-preserving in their linear part.

As a final remark, even if a reversible transformation is implemented with the help of ancilla
bits, as long as the ancilla bits start and end in the same state $a_1 \ldots a_k$, they have no effect on any
of the invariants discussed above, and for that reason are irrelevant. ■

6 Hamming Weights and Inner Products

The purpose of this section is to collect various mathematical results about what a reversible
transformation $G : \{0,1\}^n \to \{0,1\}^n$ can and cannot do to the Hamming weight of its input, or to
the inner product of two inputs. That is, we study the possible relationships that can hold between
$|x|$ and $|G(x)|$, or between $x \cdot y$ and $G(x) \cdot G(y)$ (especially modulo various positive integers $k$).
Not only are these results used heavily in the rest of the classification, but some of them might be
of independent interest.

6.1 Ruling Out Mod-Shifters 

Call a reversible transformation a mod-shifter if it always shifts the Hamming weight mod $k$ of its
input string by some fixed, nonzero amount. When $k = 2$, clearly mod-shifters exist: indeed, the
humble NOT gate satisfies $|\mathrm{NOT}(x)| \equiv |x| + 1 \pmod 2$ for all $x \in \{0,1\}$, and likewise for any other
parity-flipping gate. However, we now show that $k = 2$ is the only possibility: mod-shifters do not
exist for any larger $k$.

Theorem 12 There are no mod-shifters for $k \geq 3$. In other words: let $G$ be a reversible
transformation on $n$-bit strings, and suppose

$$|G(x)| \equiv |x| + j \pmod k$$

for all $x \in \{0,1\}^n$. Then either $j = 0$ or $k = 2$.

Proof. Suppose the above equation holds for all $x$. Then introducing a new complex variable $z$,
we have

$$z^{|G(x)|} \equiv z^{|x|+j} \pmod{z^k - 1}$$

(since working mod $z^k - 1$ is equivalent to setting $z^k = 1$). Since the above is true for all $x$,

$$\sum_{x \in \{0,1\}^n} z^{|G(x)|} \equiv \sum_{x \in \{0,1\}^n} z^{|x|+j} \pmod{z^k - 1}. \qquad (1)$$

By reversibility, we have

$$\sum_{x \in \{0,1\}^n} z^{|G(x)|} = \sum_{x \in \{0,1\}^n} z^{|x|} = (z+1)^n.$$

Therefore equation (1) simplifies to

$$(z+1)^n \left(z^j - 1\right) \equiv 0 \pmod{z^k - 1}.$$

Now, since $z^k - 1$ has no repeated roots, it can divide $(z+1)^n (z^j - 1)$ only if it divides $(z+1)(z^j - 1)$.
For this we need either $j = 0$, causing $z^j - 1 = 0$, or else $j = k - 1$ (from degree considerations).
But it is easily checked that the equality

$$z^k - 1 = (z+1)\left(z^{k-1} - 1\right)$$

holds only if $k = 2$. ■
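
The final step of the proof, that $z^k - 1 = (z+1)(z^{k-1} - 1)$ forces $k = 2$, can be confirmed by expanding the product. Here is a small sketch using coefficient lists; the function names are hypothetical and the range of $k$ tested is arbitrary.

```python
def poly_mul(p, q):
    """Multiply integer polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def final_equality_holds(k):
    """Check whether z^k - 1 == (z + 1) * (z^(k-1) - 1) as polynomials, for k >= 2."""
    lhs = [-1] + [0] * (k - 1) + [1]      # coefficients of z^k - 1
    factor = [0] * k                      # coefficients of z^(k-1) - 1
    factor[0] -= 1
    factor[k - 1] += 1
    return poly_mul([1, 1], factor) == lhs
```

Expanding by hand gives $(z+1)(z^{k-1}-1) = z^k + z^{k-1} - z - 1$, so the equality holds exactly when $z^{k-1} = z$, which the check above reproduces.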

In Appendix 15 we provide an alternative proof of Theorem 12 using linear algebra. The
alternative proof is longer, but perhaps less mysterious.

6.2 Inner Products Mod k

We have seen that there exist orthogonal gates (such as the $T_k$ gates), which preserve inner products
mod 2. In this section, we first show that no reversible gate that changes Hamming weights can
preserve inner products mod $k$ for any $k \geq 3$. We then observe that, if a reversible gate is
orthogonal, then it must be linear, and we give necessary and sufficient conditions for orthogonality.

Theorem 13 Let $G$ be a non-conservative $n$-bit reversible gate, and suppose

$$G(x) \cdot G(y) \equiv x \cdot y \pmod k$$

for all $x, y \in \{0,1\}^n$. Then $k = 2$.


Proof. As in the proof of Theorem 12, we promote the congruence to a congruence over complex
polynomials:

$$z^{G(x) \cdot G(y)} \equiv z^{x \cdot y} \pmod{z^k - 1}.$$

Fix a string $x \in \{0,1\}^n$ such that $|G(x)| > |x|$, which must exist because $G$ is non-conservative.
Then sum the congruence over all $y$:

$$\sum_{y \in \{0,1\}^n} z^{G(x) \cdot G(y)} \equiv \sum_{y \in \{0,1\}^n} z^{x \cdot y} \pmod{z^k - 1}.$$

The summation on the right simplifies as follows.

$$\sum_{y \in \{0,1\}^n} z^{x \cdot y} = \sum_{y \in \{0,1\}^n} \prod_{i=1}^{n} z^{x_i y_i} = \prod_{i=1}^{n} \sum_{y_i \in \{0,1\}} z^{x_i y_i} = \prod_{i=1}^{n} \left(1 + z^{x_i}\right) = (1+z)^{|x|} \, 2^{n-|x|}.$$

Similarly,

$$\sum_{y \in \{0,1\}^n} z^{G(x) \cdot G(y)} = (1+z)^{|G(x)|} \, 2^{n-|G(x)|},$$

since summing over all $y$ is the same as summing over all $G(y)$. So we have

$$(1+z)^{|G(x)|} \, 2^{n-|G(x)|} \equiv (1+z)^{|x|} \, 2^{n-|x|} \pmod{z^k - 1},$$

$$0 \equiv (1+z)^{|x|} \, 2^{n-|G(x)|} \left( 2^{|G(x)|-|x|} - (1+z)^{|G(x)|-|x|} \right) \pmod{z^k - 1},$$

or equivalently, letting

$$p(x) := 2^{|G(x)|-|x|} - (1+z)^{|G(x)|-|x|},$$

we find that $z^k - 1$ divides $(1+z)^{|x|} p(x)$ as a polynomial. Now, the roots of $z^k - 1$ lie on the unit
circle centered at 0. Meanwhile, the roots of $p(x)$ lie on the circle in the complex plane of radius
2, centered at $-1$. The only point of intersection of these two circles is $z = 1$, so that is the only
root of $z^k - 1$ that can be covered by $p(x)$. On the other hand, clearly $z = -1$ is the only root of
$(1+z)^{|x|}$. Hence, the only roots of $z^k - 1$ are $1$ and $-1$, so we conclude that $k = 2$. ■
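
The closed form $\sum_y z^{x \cdot y} = (1+z)^{|x|} 2^{n-|x|}$ used in the proof can be spot-checked numerically. The sketch below substitutes integer values for $z$ so that the comparison is exact; the function names are illustrative assumptions.

```python
from itertools import product

def dot(x, y):
    """Integer inner product of two bit tuples."""
    return sum(a * b for a, b in zip(x, y))

def sum_over_y(x, z):
    """Sum of z^(x . y) over all y in {0,1}^n, computed term by term."""
    return sum(z ** dot(x, y) for y in product((0, 1), repeat=len(x)))

def closed_form(x, z):
    """The simplified expression (1 + z)^|x| * 2^(n - |x|)."""
    n, weight = len(x), sum(x)
    return (1 + z) ** weight * 2 ** (n - weight)
```

A polynomial identity that holds at enough integer points holds identically, so agreement on a few values of $z$ for every $x$ of a given length is a reasonable sanity check, though not a proof.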

We now study reversible transformations that preserve inner products mod 2. 


Lemma 14 Every orthogonal gate $G$ is linear.

Proof. Suppose

$$G(x) \cdot G(y) \equiv x \cdot y \pmod 2.$$

Then for all $x, y, z$,

$$\begin{aligned}
G(x \oplus y) \cdot G(z) &\equiv (x \oplus y) \cdot z \\
&\equiv x \cdot z + y \cdot z \\
&\equiv G(x) \cdot G(z) + G(y) \cdot G(z) \\
&\equiv \left(G(x) \oplus G(y)\right) \cdot G(z) \pmod 2.
\end{aligned}$$

But if the above holds for all possible $z$ (and hence, since $G$ is a bijection, for all possible
values of $G(z)$), then

$$G(x \oplus y) = G(x) \oplus G(y).$$

■


Theorem 13 and Lemma 14 have the following corollary.

Corollary 15 Let $G$ be any non-conservative, nonlinear gate. Then for all $k \geq 2$, there exist
inputs $x, y$ such that

$$G(x) \cdot G(y) \not\equiv x \cdot y \pmod k.$$


Also: 

Lemma 16 A linear transformation $G(x) = Ax$ is orthogonal if and only if $A^T A$ is the identity:
that is, if $A$’s column vectors satisfy $|v_i| \equiv 1 \pmod 2$ for all $i$ and $v_i \cdot v_j \equiv 0 \pmod 2$ for all $i \neq j$.

Proof. This is just the standard characterization of orthogonal matrices; that we are working over
$\mathbb{F}_2$ is irrelevant. First, if $G$ preserves inner products mod 2 then for all $i \neq j$,

$$1 \equiv e_i \cdot e_i \equiv (A e_i) \cdot (A e_i) \equiv |v_i| \pmod 2,$$

$$0 \equiv e_i \cdot e_j \equiv (A e_i) \cdot (A e_j) \equiv v_i \cdot v_j \pmod 2.$$

Second, if $G$ satisfies the conditions then

$$Ax \cdot Ay = (Ax)^T Ay = x^T (A^T A) y \equiv x^T y = x \cdot y \pmod 2.$$ ■
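
The column criterion of Lemma 16 is easy to test on small matrices. The sketch below compares the $A^T A = I$ test with a direct brute-force check that inner products are preserved mod 2. The example matrices are illustrative choices, not taken from the paper: the 4-by-4 matrix with zeros on the diagonal and ones elsewhere, and the 2-by-2 matrix of CNOT.

```python
from itertools import product

def gram_mod2(A):
    """Return A^T A reduced mod 2, for a square 0/1 matrix given as a list of rows."""
    n = len(A)
    return [[sum(A[r][i] * A[r][j] for r in range(n)) % 2 for j in range(n)]
            for i in range(n)]

def is_orthogonal_mod2(A):
    """Lemma 16 criterion: A^T A is the identity over F_2."""
    n = len(A)
    return gram_mod2(A) == [[int(i == j) for j in range(n)] for i in range(n)]

def apply_mod2(A, x):
    """Compute Ax over F_2."""
    return tuple(sum(a * b for a, b in zip(row, x)) % 2 for row in A)

def preserves_inner_products_mod2(A):
    """Brute-force check that (Ax).(Ay) = x.y (mod 2) for all x, y."""
    points = list(product((0, 1), repeat=len(A)))
    dot = lambda u, v: sum(a * b for a, b in zip(u, v))
    return all(dot(apply_mod2(A, x), apply_mod2(A, y)) % 2 == dot(x, y) % 2
               for x in points for y in points)

def all_but_diagonal(n):
    """Matrix with zeros on the diagonal and ones everywhere else."""
    return [[int(i != j) for j in range(n)] for i in range(n)]

CNOT_MATRIX = [[1, 0], [1, 1]]
```

For the 4-by-4 example every column has weight 3 and any two columns overlap in 2 positions, so both tests succeed; the CNOT matrix has a column of even weight and both tests fail.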


6.3 Why Mod 2 and Mod 4 Are Special 

Recall that $\wedge$ denotes bitwise AND. We first need an “inclusion/exclusion formula” for the
Hamming weight of a bitwise sum of strings.

Lemma 17 For all $v_1, \ldots, v_t \in \{0,1\}^n$, we have

$$|v_1 \oplus \cdots \oplus v_t| = \sum_{\emptyset \subset S \subseteq [t]} (-2)^{|S|-1} \Bigl| \bigwedge_{i \in S} v_i \Bigr|.$$

Proof. It suffices to prove the lemma for $n = 1$, since in the general case we are just summing over
all $i \in [n]$. Thus, assume without loss of generality that $v_1 = \cdots = v_t = 1$. Our problem then
reduces to proving the following identity:

$$\sum_{\emptyset \subset S \subseteq [t]} (-2)^{|S|-1} = \begin{cases} 0 & \text{if } t \text{ is even} \\ 1 & \text{if } t \text{ is odd,} \end{cases}$$

which follows straightforwardly from the binomial theorem. ■
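
Lemma 17 can be verified exhaustively for short strings. The following sketch evaluates both sides of the formula; the function names are assumptions introduced for this illustration.

```python
from itertools import combinations, product

def weight_of_xor(vs):
    """Hamming weight of the bitwise XOR of the bit tuples in vs."""
    n = len(vs[0])
    return sum(sum(v[pos] for v in vs) % 2 for pos in range(n))

def inclusion_exclusion(vs):
    """Right-hand side of Lemma 17: sum over nonempty S of (-2)^(|S|-1) * |AND of v_i, i in S|."""
    t, n = len(vs), len(vs[0])
    total = 0
    for size in range(1, t + 1):
        for S in combinations(range(t), size):
            and_weight = sum(all(vs[i][pos] for i in S) for pos in range(n))
            total += (-2) ** (size - 1) * and_weight
    return total
```

For $t = 2$ the formula reduces to the familiar $|u \oplus v| = |u| + |v| - 2(u \cdot v)$, which is the form used in the proof of Theorem 19.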

Lemma 18 No nontrivial affine gate G is conservative. 

Proof. Let $G(x) = Ax \oplus b$; then $|G(0^n)| = |0^n| = 0$ implies $b = 0^n$. Likewise, $|G(e_i)| = |e_i| = 1$
for all $i$ implies that $A$ is a permutation matrix. But then $G$ is trivial. ■

Theorem 19 If $G$ is a nontrivial linear gate that preserves Hamming weight mod $k$, then either
$k = 2$ or $k = 4$.

Proof. For all $x, y$, we have

$$\begin{aligned}
|x| + |y| - 2(x \cdot y) &\equiv |x \oplus y| \\
&\equiv |G(x \oplus y)| \\
&\equiv |G(x) \oplus G(y)| \\
&\equiv |G(x)| + |G(y)| - 2\left(G(x) \cdot G(y)\right) \\
&\equiv |x| + |y| - 2\left(G(x) \cdot G(y)\right) \pmod k,
\end{aligned}$$

where the first and fourth lines used Lemma 17, the second and fifth lines used that $G$ is
mod-$k$-preserving, and the third line used linearity. Hence

$$2(x \cdot y) \equiv 2\left(G(x) \cdot G(y)\right) \pmod k. \qquad (2)$$

If $k$ is odd, then equation (2) implies

$$x \cdot y \equiv G(x) \cdot G(y) \pmod k.$$

But since $G$ is nontrivial and linear, Lemma 18 says that $G$ is non-conservative. So by Theorem
13, the above equation cannot be satisfied for any odd $k \geq 3$. Likewise, if $k$ is even, then (2)
implies

$$x \cdot y \equiv G(x) \cdot G(y) \pmod{k/2}.$$

Again by Theorem 13, the above can be satisfied only if $k = 2$ or $k = 4$. ■

In Appendix 15 we provide an alternative proof of Theorem 19, one that does not rely on
Theorem 13.

Theorem 20 Let $\{o_i\}_{i=1}^{n}$ be an orthonormal basis over $\mathbb{F}_2$. An affine transformation $F(x) = Ax \oplus b$
is mod-4-preserving if and only if $|b| \equiv 0 \pmod 4$, and the vectors $v_i := A o_i$ satisfy
$|v_i| + 2(v_i \cdot b) \equiv |o_i| \pmod 4$ for all $i$ and $v_i \cdot v_j \equiv 0 \pmod 2$ for all $i \neq j$.

Proof. First, if $F$ is mod-4-preserving, then

$$0 \equiv |F(0^n)| = |A 0^n \oplus b| = |b| \pmod 4,$$

and hence

$$|o_i| \equiv |F(o_i)| = |A o_i \oplus b| = |v_i \oplus b| = |v_i| + |b| - 2(v_i \cdot b) \equiv |v_i| + 2(v_i \cdot b) \pmod 4$$

for all $i$, and hence

$$\begin{aligned}
|o_i \oplus o_j| &\equiv |F(o_i \oplus o_j)| = |v_i \oplus v_j \oplus b| \\
&= |v_i| + |v_j| + |b| - 2(v_i \cdot v_j) - 2(v_i \cdot b) - 2(v_j \cdot b) + 4\,|v_i \wedge v_j \wedge b| \\
&\equiv |v_i| + |v_j| + 2(v_i \cdot v_j) + 2(v_i \cdot b) + 2(v_j \cdot b) \pmod 4 \\
&\equiv |o_i| + |o_j| + 2(v_i \cdot v_j) \pmod 4
\end{aligned}$$

for all $i \neq j$, from which we conclude that $v_i \cdot v_j \equiv 0 \pmod 2$.

Second, if $F$ satisfies the conditions, then for any $x = \bigoplus_{i \in S} o_i$ we have

$$\begin{aligned}
|F(x)| &= \Bigl| b \oplus \bigoplus_{i \in S} v_i \Bigr| \\
&= |b| + \sum_{i \in S} |v_i| - 2 \sum_{i \in S} (b \cdot v_i) - 2 \sum_{i < j \in S} (v_i \cdot v_j) + 4(\cdots) \\
&\equiv \sum_{i \in S} \left( |v_i| - 2(b \cdot v_i) \right) \\
&\equiv \sum_{i \in S} |o_i| \pmod 4,
\end{aligned}$$

where the second line follows from Lemma 17. Furthermore, we have that

$$|x| = \Bigl| \bigoplus_{i \in S} o_i \Bigr| = \sum_{i \in S} |o_i| - 2 \sum_{i < j \in S} (o_i \cdot o_j) + 4(\cdots) \equiv \sum_{i \in S} |o_i| \pmod 4,$$

where the last equality follows from the fact that $\{o_i\}_{i=1}^{n}$ is an orthonormal basis. Therefore, we
conclude that $|F(x)| \equiv |x| \pmod 4$. ■

We note two corollaries of Theorem 20 for later use.

Corollary 21 Any linear transformation $A \in \mathbb{F}_2^{n \times n}$ that preserves Hamming weight mod 4 is also
orthogonal.

Corollary 22 An orthogonal transformation $A \in \mathbb{F}_2^{n \times n}$ preserves Hamming weight mod 4 if and
only if all of its columns have Hamming weight 1 mod 4.
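
Corollary 22 can be illustrated by brute force. In the sketch below, the matrix with zeros on the diagonal and ones elsewhere is used as a test case (an illustrative choice, with hypothetical helper names): for $n = 6$ its columns have weight $5 \equiv 1 \pmod 4$, while for $n = 4$ they have weight $3$, so only the first should preserve Hamming weight mod 4 even though both are orthogonal.

```python
from itertools import product

def apply_mod2(A, x):
    """Compute Ax over F_2 for a 0/1 matrix A given as a list of rows."""
    return tuple(sum(a * b for a, b in zip(row, x)) % 2 for row in A)

def preserves_weight_mod(A, k):
    """True if |Ax| = |x| (mod k) for every input x."""
    return all(sum(apply_mod2(A, x)) % k == sum(x) % k
               for x in product((0, 1), repeat=len(A)))

def all_but_diagonal(n):
    """Matrix with zeros on the diagonal and ones everywhere else."""
    return [[int(i != j) for j in range(n)] for i in range(n)]
```

Both matrices act by complementing every bit when the input has odd weight, which changes the weight from $w$ to $n - w$; for odd $w$ this difference is a multiple of 4 when $n = 6$ but not when $n = 4$.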


7 Reversible Circuit Constructions 

In this section, we show that all the classes of reversible transformations listed in Theorem 3 are
indeed generated by the gates that we claimed, by giving explicit synthesis procedures. In order to
justify Theorem 8, we also verify that in each case, only $O(1)$ ancilla bits are needed, even though
this constraint makes some of the constructions more complicated than otherwise.

Many of our constructions (those for Toffoli and CNOT, for example) have appeared in various
forms in the reversible computing literature, and are included here only for completeness. Others
(those for $C_k$ and $F_4$, for example) are new as far as we know, but not hard.

7.1 Non-Affine Circuits

We start with the non-affine classes: $\langle \text{Toffoli} \rangle$, $\langle \text{Fredkin} \rangle$, $\langle \text{Fredkin}, C_k \rangle$, and $\langle \text{Fredkin}, \text{NOT} \rangle$.

Theorem 23 (variants in [28, 25]) Toffoli generates all reversible transformations on $n$ bits,
using only 2 ancilla bits.

Proof. Any reversible transformation $F : \{0,1\}^n \to \{0,1\}^n$ is a permutation of $n$-bit strings,
and any permutation can be written as a product of transpositions. So it suffices to show how
to use Toffoli gates to implement an arbitrary transposition $\sigma_{y,z}$: that is, a mapping that sends
$y = y_1 \ldots y_n$ to $z = z_1 \ldots z_n$ and $z$ to $y$, and all other $n$-bit strings to themselves.

Given any $n$-bit string $w$, let us define $w$-CNOT to be the $(n+1)$-bit gate that flips its last
bit if its first $n$ bits are equal to $w$, and that does nothing otherwise. (Thus, the Toffoli gate is
11-CNOT, while CNOT itself is 1-CNOT.) Given $y$-CNOT and $z$-CNOT gates, we can implement
the transposition $\sigma_{y,z}$ as follows on input $x$:

1. Initialize an ancilla bit, $a = 1$.

2. Apply $y$-CNOT $(x, a)$.

3. Apply $z$-CNOT $(x, a)$.

4. Apply NOT gates to all $x_i$’s such that $y_i \neq z_i$.

5. For each $i$ such that $y_i \neq z_i$, apply CNOT $(a, x_i)$.

6. Apply $z$-CNOT $(x, a)$.

7. Apply $y$-CNOT $(x, a)$.
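
The seven steps above can be simulated directly to confirm that they swap $y$ and $z$, fix every other string, and return the ancilla to 1. This is a minimal sketch in which the $w$-CNOT gates are modelled as primitives; the function names are assumptions made for illustration.

```python
from itertools import product

def w_cnot(w, x, a):
    """w-CNOT: flip the bit a exactly when the control string x equals w."""
    return a ^ int(tuple(x) == tuple(w))

def transposition(y, z, x):
    """Run steps 1-7 on input x; returns (output string, final ancilla value)."""
    x = list(x)
    a = 1                                   # step 1: ancilla starts at 1
    a = w_cnot(y, x, a)                     # step 2
    a = w_cnot(z, x, a)                     # step 3: now a == 0 iff x is y or z
    differing = [i for i in range(len(x)) if y[i] != z[i]]
    for i in differing:                     # step 4: NOT on positions where y, z differ
        x[i] ^= 1
    for i in differing:                     # step 5: CNOT(a, x_i) undoes step 4 when a == 1
        x[i] ^= a
    a = w_cnot(z, x, a)                     # step 6
    a = w_cnot(y, x, a)                     # step 7: ancilla is back to 1
    return tuple(x), a
```

For example, `transposition((0, 1, 1), (1, 0, 1), x)` exchanges the two named strings and acts as the identity elsewhere.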

Thus, all that remains is to implement $w$-CNOT using Toffoli. Observe that we can simulate
any $w$-CNOT using $1^n$-CNOT, by negating certain input bits (namely, those for which $w_i = 0$)
before and after we apply the $1^n$-CNOT. An example of the transposition $\sigma_{011,101}$ is given in
Figure 4.

Figure 4: Generating the transposition $\sigma_{011,101}$

So it suffices to implement $1^n$-CNOT, with control bits $x_1 \ldots x_n$ and target bit $y$. The base
case is $n = 2$, which we implement directly using Toffoli. For $n \geq 3$, we do the following.

• Let $a$ be an ancilla.

• Apply $1^{\lceil n/2 \rceil}$-CNOT $(x_1 \ldots x_{\lceil n/2 \rceil}, a)$.

• Apply $1^{\lfloor n/2 \rfloor + 1}$-CNOT $(x_{\lceil n/2 \rceil + 1} \ldots x_n, a, y)$.

• Apply $1^{\lceil n/2 \rceil}$-CNOT $(x_1 \ldots x_{\lceil n/2 \rceil}, a)$.

• Apply $1^{\lfloor n/2 \rfloor + 1}$-CNOT $(x_{\lceil n/2 \rceil + 1} \ldots x_n, a, y)$.

[Footnote to Theorem 23: Notice that we need at least 2 ancilla bits so that we can generate CNOT and NOT using Toffoli.]

The crucial point is that this construction works whether the ancilla is initially 0 or 1. In other
words, we can use any bit which is not one of the inputs, instead of a new ancilla. For instance, we
can have one bit dedicated for use in $1^n$-CNOT gates, which we use in the recursive applications
of $1^{\lceil n/2 \rceil}$-CNOT and $1^{\lfloor n/2 \rfloor + 1}$-CNOT, and the recursive applications within them, and so on.
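
The recursion can be exercised in code to check both its correctness with a "dirty" ancilla and the resulting Toffoli count. The sketch below is one possible reading of the construction: at every level it borrows some wire that is neither a control nor the target of the current call, which is an assumption about how the spare bit is chosen. The names `multi_cnot` and `toffoli` are hypothetical.

```python
from itertools import product

def toffoli(state, c1, c2, t, counter):
    """Apply a Toffoli gate in place and count it."""
    state[t] ^= state[c1] & state[c2]
    counter[0] += 1

def multi_cnot(state, controls, target, counter):
    """Flip state[target] iff all control wires are 1, using only Toffoli gates.

    Some wire outside controls + [target] is borrowed as the ancilla a; its
    initial value does not matter and it is restored when the call returns.
    Requires at least two controls and at least one spare wire when n >= 3.
    """
    n = len(controls)
    if n == 2:                                  # base case: a single Toffoli
        toffoli(state, controls[0], controls[1], target, counter)
        return
    used = set(controls) | {target}
    a = next(w for w in range(len(state)) if w not in used)
    half = (n + 1) // 2                         # ceil(n / 2)
    first = controls[:half]                     # x_1 ... x_ceil(n/2)
    second = controls[half:] + [a]              # remaining controls plus a
    for _ in range(2):                          # the four bullet steps, in order
        multi_cnot(state, first, a, counter)
        multi_cnot(state, second, target, counter)
```

With this reading, the gate count follows the recurrence $T(n) = 2T(\lfloor n/2 \rfloor + 1) + 2T(\lceil n/2 \rceil)$ with $T(2) = 1$, giving $T(3) = 4$ and $T(4) = 10$.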

Carefully inspecting the above proof shows that $O(n^2 2^n)$ gates and 3 ancilla bits suffice to
generate any transformation. Notice the main reason we need two of the three ancillas is to apply
the NOT gate while the ancilla $a$ is active. Case analysis shows that any circuit constructible from
NOT, CNOT, and Toffoli is equivalent to a circuit of NOT gates followed by a circuit of CNOT and
Toffoli gates. For example, see Figure 5. This at most triples the size of the circuit. Therefore,
we can construct a circuit that uses only two ancilla bits: apply the recursive construction, push
the NOT gates to the front, and use two ancilla bits to generate the NOT gates. The recursive
construction itself uses one ancilla bit, plus one more to implement CNOT. ■


Figure 5: Example of equivalent Toffoli circuit with NOT gates pushed to the front 

The particular construction above was inspired by a result of Ben-Or and Cleve [6], in which
they compute algebraic formulas in a straight-line computation model using a constant number of
registers. We note that Toffoli [28] proved a version of Theorem 23, but with $O(n)$ ancilla bits
rather than $O(1)$. More recently, Shende et al. [25] gave a slightly more complicated construction
which uses only 1 ancilla bit, and also gives explicit bounds on the number of Toffoli gates required
based on the number of fixed points of the permutation. Recall that at least 1 ancilla bit is needed
by Proposition 9.

Next, let CCSWAP, or Controlled-Controlled-SWAP, be the 4-bit gate that swaps its last two
bits if its first two bits are both 1, and otherwise does nothing.

Proposition 24 Fredkin generates CCSWAP. 

Proof. Let a be an ancilla bit initialized to 0. We implement CCSWAP (x, y, z, w) by applying 
Fredkin (x, y, a), then Fredkin (a, z, w), then again Fredkin (x, y,a). ■ 
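
The three-gate construction in this proof can be checked on all 16 inputs. This is a small sketch with hypothetical function names; the ancilla is modelled as a local variable that starts at 0.

```python
from itertools import product

def fredkin(c, p, q):
    """Controlled-SWAP: swap p and q when the control c is 1."""
    return (c, q, p) if c else (c, p, q)

def ccswap_via_fredkin(x, y, z, w):
    """CCSWAP(x, y, z, w) built from three Fredkin gates and one ancilla a = 0."""
    a = 0
    x, y, a = fredkin(x, y, a)   # a now holds x AND y (and y is cleared if x = 1)
    a, z, w = fredkin(a, z, w)   # swap z, w exactly when x AND y = 1
    x, y, a = fredkin(x, y, a)   # undo the first gate, restoring y and a
    return (x, y, z, w), a
```

The first gate moves $y$ into the ancilla when $x = 1$, so the ancilla equals $x \wedge y$ during the middle gate and is returned to 0 by the third.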

We can now prove an analogue of Theorem 23 for conservative transformations.

[Footnote to the proof of Theorem 23: The number of Toffoli gates $T(n)$ needed to implement a $1^n$-CNOT (which dominates the cost of a transposition)
by this recursive scheme is given by the recurrence

$$T(n) = 2T\left(1 + \lfloor n/2 \rfloor\right) + 2T\left(\lceil n/2 \rceil\right),$$

which we solve to obtain $T(n) = O(n^2)$.]

Theorem 25 Fredkin generates all conservative transformations on n bits, using only 5 ancilla 
bits. 

Proof. In this proof, we will use the dual-rail representation, in which 0 is encoded as 01 and 1 is
encoded as 10. We will also use Proposition 24, that Fredkin generates CCSWAP.

As in Theorem 23, we can decompose any reversible transformation $F : \{0,1\}^n \to \{0,1\}^n$ as
a product of transpositions $\sigma_{y,z}$. In this case, each $\sigma_{y,z}$ transposes two $n$-bit strings $y = y_1 \ldots y_n$
and $z = z_1 \ldots z_n$ of the same Hamming weight.

Given any $n$-bit string $w$, let us define $w$-CSWAP to be the $(n+2)$-bit gate that swaps its last
two bits if its first $n$ bits are equal to $w$, and that does nothing otherwise. (Thus, Fredkin is
1-CSWAP, while CCSWAP is 11-CSWAP.) Then given $y$-CSWAP and $z$-CSWAP gates, where
$|y| = |z|$, as well as CCSWAP gates, we can implement the transposition $\sigma_{y,z}$ on input $x$ as follows:

1. Initialize two ancilla bits (comprising one dual-rail register) to $a\bar{a} = 01$.

2. Apply $y$-CSWAP $(x_1 \ldots x_n, a, \bar{a})$.

3. Apply $z$-CSWAP $(x_1 \ldots x_n, a, \bar{a})$.

4. Pair off the $i$’s such that $y_i = 1$ and $z_i = 0$, with the equally many $j$’s such that $z_j = 1$ and
$y_j = 0$. For each such $(i, j)$ pair, apply Fredkin $(a, x_i, x_j)$.

5. Apply $z$-CSWAP $(x_1 \ldots x_n, a, \bar{a})$.

6. Apply $y$-CSWAP $(x_1 \ldots x_n, a, \bar{a})$.
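
As with the Toffoli case, the six steps above can be simulated to confirm that they implement the transposition while conserving Hamming weight at every stage. In this sketch the $w$-CSWAP gates are treated as primitives and the names are illustrative assumptions.

```python
from itertools import product

def w_cswap(w, x, p, q):
    """w-CSWAP: swap the bits p, q exactly when the control string x equals w."""
    return (q, p) if tuple(x) == tuple(w) else (p, q)

def conservative_transposition(y, z, x):
    """Run steps 1-6 on input x; returns (output string, final dual-rail ancilla)."""
    assert sum(y) == sum(z), "y and z must have equal Hamming weight"
    x = list(x)
    a, a_bar = 0, 1                                  # step 1: dual-rail 0
    a, a_bar = w_cswap(y, x, a, a_bar)               # step 2
    a, a_bar = w_cswap(z, x, a, a_bar)               # step 3: a == 1 iff x is y or z
    only_y = [i for i in range(len(x)) if y[i] == 1 and z[i] == 0]
    only_z = [j for j in range(len(x)) if z[j] == 1 and y[j] == 0]
    for i, j in zip(only_y, only_z):                 # step 4: Fredkin(a, x_i, x_j)
        if a:
            x[i], x[j] = x[j], x[i]
    a, a_bar = w_cswap(z, x, a, a_bar)               # step 5
    a, a_bar = w_cswap(y, x, a, a_bar)               # step 6: ancilla back to 01
    return tuple(x), (a, a_bar)
```

Every operation used is a (controlled) swap, so the total Hamming weight of input bits plus ancilla bits never changes.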

The logic here is exactly the same as in the construction of transpositions in Theorem 23; the
only difference is that now we need to conserve Hamming weight.

All that remains is to implement $w$-CSWAP using CCSWAP. First let us show how to implement
$1^n$-CSWAP using CCSWAP. Once again, we do so using a recursive construction. For the
base case, $n = 2$, we just use CCSWAP. For $n \geq 3$, we implement $1^n$-CSWAP $(x_1, \ldots, x_n, y, z)$ as
follows:

• Initialize two ancilla bits (comprising one dual-rail register) to $a\bar{a} = 01$.

• Apply $1^{\lceil n/2 \rceil}$-CSWAP $(x_1 \ldots x_{\lceil n/2 \rceil}, a, \bar{a})$.

• Apply $1^{\lfloor n/2 \rfloor + 1}$-CSWAP $(x_{\lceil n/2 \rceil + 1} \ldots x_n, a, y, z)$.

• Apply $1^{\lceil n/2 \rceil}$-CSWAP $(x_1 \ldots x_{\lceil n/2 \rceil}, a, \bar{a})$.

• Apply $1^{\lfloor n/2 \rfloor + 1}$-CSWAP $(x_{\lceil n/2 \rceil + 1} \ldots x_n, a, y, z)$.

The logic is the same as in the construction of $1^n$-CNOT in Theorem 23, except we now use 2
ancilla bits for the dual-rail representation.

Finally, we need to implement $w$-CSWAP $(x_1 \ldots x_n, y, z)$, for arbitrary $w$, using $1^n$-CSWAP.
We do so by first constructing $w$-CSWAP from NOT gates and $1^n$-CSWAP. Observe that we only
use the NOT gate on the control bits of the Fredkin gates used during the construction, so the
equivalence given in Figure 6 holds (i.e., we can remove the NOT gates).

Figure 6: Removing NOT gates from the Fredkin circuit

Hence, we can build a $w$-CSWAP out of CCSWAPs using only 5 ancilla bits: 1 for CCSWAP,
2 for the $1^n$-CSWAP, and 2 for a transposition. ■

We note that, before the above construction was found by the authors, unpublished and independent
work by Siyao Xu and Qian Yu first showed that $O(1)$ ancillas were sufficient.

In [10], the result that Fredkin generates all conservative transformations is stated without
proof, and credited to B. Silver. We do not know how many ancilla bits Silver’s construction used.

Next, we prove an analogue of Theorem 23 for the mod-$k$-respecting transformations, for all
$k \geq 2$. First, let $\mathrm{CC}_k$, or Controlled-$C_k$, be the $(k+1)$-bit gate that applies $C_k$ to the final $k$ bits
if the first bit is 1, and does nothing if the first bit is 0.

Proposition 26 Fredkin+$C_k$ generates $\mathrm{CC}_k$, using 2 ancilla bits, for all $k \geq 2$.

Proof. To implement $\mathrm{CC}_k$ on input bits $x, y_1 \ldots y_k$, we do the following:

1. Initialize ancilla bits $a, b$ to $0, 1$ respectively.

2. Use Fredkin gates and swaps to swap $y_1, y_2$ with $a, b$, conditioned on $x = 0$.

3. Apply $C_k$ to $y_1 \ldots y_k$.

4. Repeat step 2.

■
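
The idea of the proof is that swapping in the ancilla pair $0, 1$ guarantees that $C_k$ sees a string that is neither all zeros nor all ones, so it acts trivially. The sketch below checks this for small $k$. It assumes, consistently with how $C_k$ is used in this section, that $C_k$ exchanges $0^k$ and $1^k$ and fixes every other string; the conditional swap of step 2 is modelled directly rather than decomposed into Fredkin gates, and the function names are hypothetical.

```python
from itertools import product

def c_k(bits):
    """C_k (as assumed here): exchange 0^k and 1^k, fix every other k-bit string."""
    k = len(bits)
    if not any(bits):
        return (1,) * k
    if all(bits):
        return (0,) * k
    return tuple(bits)

def cc_k(x, ys):
    """Controlled-C_k on (x, y_1 ... y_k), following steps 1-4 of the proof."""
    ys = list(ys)
    a, b = 0, 1                                   # step 1
    if x == 0:                                    # step 2: conditional swap with a, b
        ys[0], a = a, ys[0]
        ys[1], b = b, ys[1]
    ys = list(c_k(ys))                            # step 3
    if x == 0:                                    # step 4: repeat step 2
        ys[0], a = a, ys[0]
        ys[1], b = b, ys[1]
    return tuple(ys), (a, b)
```

The ancilla pair is returned to $0, 1$ in every case, since when $x = 0$ the string passed to $C_k$ begins with $01$ and is left untouched.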

Then we have the following.

Theorem 27 Fredkin+$\mathrm{CC}_k$ generates all mod-$k$-preserving transformations, for $k \geq 1$, using only
5 ancilla bits.

Proof. The proof is exactly the same as that of Theorem 25, except for one detail. Namely, let $y$
and $z$ be $n$-bit strings such that $|y| \equiv |z| \pmod k$. Then in the construction of the transposition
$\sigma_{y,z}$ from $y$-CSWAP and $z$-CSWAP gates, when we are applying step 4 (the pairing step), it is
possible that $|y| - |z|$ is some nonzero multiple of $k$, say $qk$. If so, then we can no longer pair off each $i$ such that $y_i = 1$
and $z_i = 0$ with a unique $j$ such that $z_j = 1$ and $y_j = 0$: after we have done that, there will remain
a surplus of ‘1’ bits of size $qk$, either in $y$ or in $z$, as well as a matching surplus of ‘0’ bits of size $qk$
in the other string. However, we can get rid of both surpluses using $q$ applications of a $\mathrm{CC}_k$ gate
(which we have by Proposition 26), with $c$ as the control bit. ■

As a special case of Theorem 27, note that Fredkin+$\mathrm{CC}_1$ = Fredkin+CNOT generates all
mod-1-preserving transformations, or in other words, all transformations.

We just need one additional fact about the $C_k$ gate.

[Footnote to step 2 of the proof of Proposition 26: In more detail, use Fredkin gates to swap $y_1, y_2$ with $a, b$, conditioned on $x = 1$. Then swap $y_1, y_2$ with $a, b$
unconditionally.]

Proposition 28 $C_k$ generates Fredkin, using $k - 2$ ancilla bits, for all $k \geq 3$.


Proof. Let $a_1 \ldots a_{k-2}$ be ancilla bits initially set to 1. Then to implement Fredkin on input bits
$x, y, z$, we apply:

$C_k(x, y, a_1 \ldots a_{k-2})$,
$C_k(x, z, a_1 \ldots a_{k-2})$,
$C_k(x, y, a_1 \ldots a_{k-2})$. ■
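
The three-gate sequence can be verified exhaustively for small $k$. As in the earlier sketch, this assumes that $C_k$ exchanges $0^k$ and $1^k$ and fixes all other strings; `fredkin_via_c_k` is a hypothetical name.

```python
from itertools import product

def c_k(bits):
    """C_k (as assumed here): exchange 0^k and 1^k, fix every other k-bit string."""
    k = len(bits)
    if not any(bits):
        return (1,) * k
    if all(bits):
        return (0,) * k
    return tuple(bits)

def fredkin_via_c_k(x, y, z, k):
    """Fredkin(x, y, z) from three C_k gates and k - 2 ancilla bits set to 1."""
    anc = (1,) * (k - 2)
    x, y, *anc = c_k((x, y, *anc))   # first gate acts on x, y and the ancillas
    x, z, *anc = c_k((x, z, *anc))   # second gate acts on x, z and the ancillas
    x, y, *anc = c_k((x, y, *anc))   # third gate acts on x, y and the ancillas
    return (x, y, z), tuple(anc)
```

For instance, on input $x y z = 110$ the first gate clears $x$, $y$ and the ancillas, the second sets $x$, $z$ and the ancillas back to 1, and the third does nothing, leaving $101$ with the ancillas restored.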


Combining Theorem 27 with Proposition 28 now yields the following.

Corollary 29 $C_k$ generates all mod-$k$-preserving transformations for $k \geq 3$, using only $k + 3$ ancilla
bits.

Finally, we handle the parity-flipping case.

Proposition 30 Fredkin+NOTNOT (and hence, Fredkin+NOT) generates $\mathrm{CC}_2$.

Proof. This follows from Proposition 26, if we recall that $C_2$ is equivalent to NOTNOT up to an
irrelevant bit-swap. ■

Theorem 31 Fredkin+NOT generates all parity-respecting transformations on $n$ bits, using only
6 ancilla bits.

Proof. Let $F$ be any parity-flipping transformation on $n$ bits. Then $F \otimes \mathrm{NOT}$ is an $(n+1)$-bit
parity-preserving transformation. So by Theorem 27, we can implement $F \otimes \mathrm{NOT}$ using
Fredkin+$\mathrm{CC}_2$ (and we have $\mathrm{CC}_2$ by Proposition 30). We can then apply a NOT gate to the
$(n+1)$-st bit to get $F$ alone. ■

One consequence of Theorem 31 is that every parity-flipping transformation can be constructed
from parity-preserving gates and exactly one NOT gate.

7.2 Affine Circuits

It is well-known that CNOT is a “universal affine gate”: 

Theorem 32 CNOT generates all affine transformations, with only 1 ancilla bit (or 0 for linear 
transformations). 

Proof. Let $G(x) = Ax \oplus b$ be the affine transformation that we want to implement, for some
invertible matrix $A \in \mathbb{F}_2^{n \times n}$. Then given an input $x = x_1 \ldots x_n$, we first use CNOT gates (at most
$\binom{n}{2}$ of them) to map $x$ to $Ax$, by reversing the sequence of row-operations that maps $A$ to the
identity matrix in Gaussian elimination. Finally, if $b = b_1 \ldots b_n$ is nonzero, then for each $i$ such
that $b_i = 1$, we apply a CNOT from an ancilla bit that is initialized to 1. ■
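
The Gaussian-elimination argument translates into a short synthesis routine. The sketch below is one way to realize it, under the assumption that a missing pivot is repaired by adding a lower row (itself a CNOT) rather than by a row swap; it makes no attempt to meet any particular gate-count bound. The names `cnot_circuit` and `apply_affine` are hypothetical.

```python
from itertools import product

def cnot_circuit(A):
    """Return CNOT gates (control, target) that map x to Ax for invertible A over F_2.

    Row operation "row t ^= row c" corresponds to CNOT(c, t). Reducing A to the
    identity records operations E_1, ..., E_m; applying them in reverse order
    to the wires realizes A itself.
    """
    n = len(A)
    M = [row[:] for row in A]
    ops = []

    def add_row(c, t):
        M[t] = [u ^ v for u, v in zip(M[t], M[c])]
        ops.append((c, t))

    for col in range(n):
        if M[col][col] == 0:                       # create a pivot by adding a lower row
            pivot = next(r for r in range(col + 1, n) if M[r][col])
            add_row(pivot, col)
        for r in range(n):                         # clear the rest of the column
            if r != col and M[r][col]:
                add_row(col, r)
    return ops[::-1]

def apply_affine(A, b, x):
    """Evaluate Ax xor b using only CNOT gates and one ancilla wire set to 1."""
    wires = list(x) + [1]                          # last wire is the ancilla
    anc = len(x)
    gates = cnot_circuit(A) + [(anc, i) for i, bit in enumerate(b) if bit]
    for c, t in gates:
        wires[t] ^= wires[c]
    return tuple(wires[:anc]), wires[anc]
```

The ancilla is only ever used as a control, so it stays equal to 1 throughout, matching the claim that one ancilla bit suffices for the affine part.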

A simple modification of Theorem 32 handles the parity-preserving case.

Theorem 33 CNOTNOT generates all parity-preserving affine transformations with only 1 ancilla
bit (or 0 for linear transformations).

Proof. Let $G(x) = Ax \oplus b$ be a parity-preserving affine transformation. We first construct
the linear part of $G$ using Gaussian elimination. Notice that for $G$ to be parity-preserving, the
columns $v_i$ of $A$ must satisfy $|v_i| \equiv 1 \pmod 2$ for all $i$. For this reason, the row-elimination steps
come in pairs, so we can implement them using CNOTNOT. Notice further that since $G$ is
parity-preserving, we must have $|b| \equiv 0 \pmod 2$. So we can map $Ax$ to $Ax \oplus b$, by using CNOTNOT
gates plus one ancilla bit set to 1 to simulate NOTNOT gates. ■

Likewise (though, strictly speaking, we will not need this for the proof of Theorem 3):

Theorem 34 CNOTNOT+NOT generates all parity-respecting affine transformations using no
ancilla bits.

Proof. Use Theorem 33 to map $x$ to $Ax$, and then use NOT gates to map $Ax$ to $Ax \oplus b$. ■

We now move on to the more complicated cases of $\langle F_4 \rangle$, $\langle T_6 \rangle$, and $\langle T_4 \rangle$.

Theorem 35 $F_4$ generates all mod-4-preserving affine transformations using no ancilla bits.

Proof. Let $F(x) = Ax \oplus b$ be an $n$-bit affine transformation, $n \geq 2$, that preserves Hamming weight
mod 4. Using $F_4$ gates, we will show how to map $F(x) = y_1 \ldots y_n$ to $x = x_1 \ldots x_n$. Reversing the
construction then yields the desired map from $x$ to $F(x)$.

At any point in time, each $y_j$ is some affine function of the $x_i$’s. We say that $x_i$ “occurs in”
$y_j$ if $y_j$ depends on $x_i$. At a high level, our procedure will consist of the following steps, repeated
up to $n - 3$ times:

1. Find an $x_i$ that does not occur in every $y_j$.

2. Manipulate the $y_j$’s so that $x_i$ occurs in exactly one $y_j$.

3. Argue that no other $x_{i'}$ can then occur in that $y_j$. Therefore, we have recursively reduced our
problem to one involving a reversible, mod-4-preserving, affine function on $n - 1$ variables.

It is not hard to see that the only mod-4-preserving affine functions on 3 or fewer variables are
permutations of the bits. So if we can show that the three steps above can always be carried out,
then we are done.

First, since $A$ is invertible, it is not the all-1’s matrix, which means that there must be an $x_i$
that does not occur in every $y_j$.

Second, if there are at least three occurrences of $x_i$, then apply $F_4$ to three positions in which $x_i$
occurs, plus one position in which $x_i$ does not occur. The result of this is to decrease the number
of occurrences of $x_i$ by 2. Repeat until there are at most two occurrences of $x_i$. Since $F_4$ is
mod-4-preserving and affine, the resulting transformation $F'(x) = A'x \oplus b'$ must still be mod-4-preserving
and affine, so it must still satisfy the conditions of Theorem 20. In particular, no column vector of
$A'$ can have even Hamming weight. Since two occurrences of $x_i$ would necessitate such a column
vector, we know that $x_i$ must occur only once.

Third, if $x_i$ occurs only once in $F'(x)$, then the corresponding column vector $v_i$ has exactly
one nonzero element. Since $|v_i| = 1$, we know by Theorem 20 that $v_i \cdot b \equiv 0 \pmod 2$, which means
that $b$ has a 0 in the position where $v_i$ has a 1. Now consider the row of $A'$ that includes the
nonzero entry of $v_i$. If any other column $v_{i'}$ is also nonzero in that row, then $v_i \cdot v_{i'} \equiv 1 \pmod 2$,
which once again contradicts the conditions of Theorem 20. Thus, no other $x_{i'}$ occurs in the same
$y_j$ that $x_i$ occurs in. Indeed no constant occurs there either, since otherwise $F'$ would no longer
be mod-4-preserving. So we have reduced to the $(n-1) \times (n-1)$ case. ■

The same argument, with slight modifications, handles $\langle T_4 \rangle$ and $\langle T_6 \rangle$.

Theorem 36 $T_4$ generates all orthogonal transformations, using no ancilla bits.

Proof. The construction is identical to that of Theorem 35, except with $T_4$ instead of $F_4$. When
reducing the number of occurrences of $x_i$ to at most 2, Lemma 16 assures us that $|v_i| \equiv 1 \pmod 2$. ■

Theorem 37 $T_6$ generates all mod-4-preserving linear transformations, using no ancilla bits.

Proof. The construction is identical to that of Theorem 35, except for the following change.
Rather than using $F_4$ to reduce the number of occurrences of some $x_i$ to at most 2, we now use
$T_6$ to reduce the number of occurrences of $x_i$ to at most 4. (If there are 5 or more occurrences,
then $T_6$ can always decrease the number by 4.) We then appeal to Corollary 22, which says that
$|v_i| \equiv 1 \pmod 4$ for each $i$. This implies that no $x_i$ can occur 2, 3, or 4 times in the output vector.
But that can only mean that $x_i$ occurs once. ■

By Lemma 14 and Corollary 21, an equivalent way to state Theorem 37 is that $T_6$ generates
all affine transformations that are both mod-4-preserving and orthogonal.

All that remains is some “cleanup work” (which, again, is not even needed for the proof of
Theorem 3).

Theorem 38 $T_6$ + NOT generates all affine transformations that are mod-4-preserving (and
therefore orthogonal) in their linear part.

$T_6$ + NOTNOT generates all parity-preserving affine transformations that are mod-4-preserving
(and therefore orthogonal) in their linear part.

$F_4$ + NOT (or equivalently, $T_4$ + NOT) generates all isometries.

$F_4$ + NOTNOT (or equivalently, $T_4$ + NOTNOT) generates all parity-preserving isometries.

NOT generates all degenerate transformations.

NOTNOT generates all parity-preserving degenerate transformations.

In none of these cases are any ancilla bits needed.

Proof. As in Theorem 34, we simply apply the relevant construction for the linear part (e.g.,
Theorem 36 or 37), then handle the affine part using NOT or NOTNOT gates. ■

8 The Non-Affine Part 

Our goal, in this section, is to prove that there are no non-affine classes besides the ones listed in
Theorem 3: namely, the conservative transformations, the parity-respecting transformations, the
mod-$k$-preserving transformations for $k \geq 2$, and all transformations.

We will divide our analysis into two parts. We first show, in Section 8.1, that once a Fredkin
gate is available, matters become fairly simple. At that point, the only possibilities are $\langle \text{Fredkin} \rangle$,
$\langle \text{Fredkin}, \text{NOTNOT} \rangle$, $\langle \text{Fredkin}, \text{NOT} \rangle$, $\langle C_k \rangle$ for $k \geq 3$, and $\langle \text{Toffoli} \rangle$. Then, in Sections 8.2 through 8.4,
we prove the harder result that every non-affine gate generates Fredkin. This, in turn, is broken
into three pieces:

• In Section 8.2 we reprove a result of Lloyd [19], showing that every non-affine gate is capable
of universal computation with garbage.

• In Section 8.3 we show that every nontrivial conservative gate generates Fredkin (using the
result of Section 8.2 as one ingredient).

• In Section 8.4 we build on the result of Section 8.3 to show that every non-affine gate
generates Fredkin. This requires our first use of lattices, and also draws on some of the
results about inner products and modularity obstructions from Section 6.

Summarizing the results of this section, we will obtain the following.

Theorem 39 Every non-affine gate set generates one of the following classes: $\langle \text{Fredkin} \rangle$, $\langle C_k \rangle$ for
some $k \geq 3$, $\langle \text{Fredkin}, \text{NOTNOT} \rangle$, $\langle \text{Fredkin}, \text{NOT} \rangle$, or $\langle \text{Toffoli} \rangle$.

8.1 Above Fredkin 

Our goal, in this section, is to classify all reversible gate classes containing Fredkin. We already
know from Theorem 25 that Fredkin generates all conservative transformations. We will prove a
substantial generalization of that result. First, however, we need a proposition that will also be
used later in the paper. Given a reversible transformation $G$, let

$$W(G) := \{\, |G(x)| - |x| : x \in \{0,1\}^n \,\}$$

be the set of possible changes that $G$ can cause to the Hamming weight of its input.

Proposition 40 Let $G$ be any non-conservative gate. Then for all integers $q$, there exists a $t$ such
that $q \cdot k(G) \in W(G^{\otimes t})$.

Proof. Let $\gamma$ be the gcd of the elements in $W(G)$. Then clearly $G$ is mod-$\gamma$-respecting. By
Proposition 1, this means that $\gamma$ must divide $k(G)$.

Now by reversibility, $W(G)$ has both positive and negative elements. But this means that we
can find any integer multiple of $\gamma$ in some set of the form

$$W(G^{\otimes m}) = \{\, w_1 + \cdots + w_m : w_1, \ldots, w_m \in W(G) \,\}.$$

Therefore we can find any integer multiple of $k(G)$ in some $W(G^{\otimes t})$ as well. ■
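
The set $W(G)$ and its tensor-power sumsets are simple to compute from a truth table. The sketch below uses CNOT as an illustrative example (not one singled out by the paper at this point), encoded as a permutation of the integers 0 to 3 with the control as the high bit; the function names are assumptions.

```python
def popcount(v):
    """Hamming weight of a non-negative integer."""
    return bin(v).count("1")

def weight_changes(table):
    """W(G) for a gate given as a permutation table on the integers 0 .. 2^n - 1."""
    return {popcount(table[x]) - popcount(x) for x in range(len(table))}

def tensor_power_changes(W, m):
    """W of the m-fold tensor power: all sums w_1 + ... + w_m with each w_i in W."""
    sums = {0}
    for _ in range(m):
        sums = {s + w for s in sums for w in W}
    return sums

CNOT_TABLE = [0b00, 0b01, 0b11, 0b10]   # control is the high bit
```

Because $W(G)$ contains both a positive and a negative element whenever $G$ is non-conservative, repeated sumsets of this kind eventually reach any prescribed multiple of the gcd $\gamma$, which is the step the proof relies on.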

We can now characterize all reversible gate sets that contain Fredkin. 

Theorem 41 Let $G$ be any gate. Then Fredkin+$G$ generates all mod-$k(G)$-preserving
transformations (including in the cases $k(G) = 1$, in which case Fredkin+$G$ generates all transformations,
and $k(G) = \infty$, in which case Fredkin+$G$ generates all conservative transformations).

Proof. Let $k = k(G)$. If $k = \infty$ then we are done by Theorem 25, so assume $k$ is finite. We
will assume without loss of generality that $G$ is mod-$k$-preserving. By Theorem 12, the only other
possibility is that $G$ is parity-flipping, but in that case we can simply repeat everything below with
$G \otimes G$, which is parity-preserving and satisfies $k(G \otimes G) = 2$, rather than with $G$ itself.

[Footnote to the proof of Proposition 40: Indeed, by using Theorem 12, one can show that $\gamma = k(G)$, except in the special case that $G$ is parity-flipping,
where we have $\gamma = 1$ and $k(G) = 2$.]

By Theorem 27, it suffices to use Fredkin+$G$ to generate the $\mathrm{CC}_k$ gate. Let $H$ be the gate
$G \otimes G^{-1}$, followed by a swap of the two input registers. Observe that $H^2$ is the identity. Also, by
Proposition 2,

$$k(H) = \gcd\left(k(G), k(G^{-1})\right) = k.$$

So by Proposition 40, there exists a positive integer $t$, as well as inputs $y = y_1 \ldots y_n$ and $z = z_1 \ldots z_n$
such that $z = H^{\otimes t}(y)$ (and $y = H^{\otimes t}(z)$, since $H^2 = I$), and $|z| = |y| + k$.

We can assume without loss of generality that y has the form 0“1^—i.e., that its bits are in 
sorted order. We would like to sort the bits of z as well. Notice that, since |z| > |y|, there is some 
i E [n] such that yi = 0 and Zi = 1. So we can easily design a circuit U of Fredkin gates, controlled 
by bit i, which reorders the bits of z so that 

/:=?7(2) = 0“-h'’+^ 


whereas U (y) = y. 

Observe that $H^{\otimes t}$ has a large number of fixed points: we have $H(u, G(u)) = (u, G(u))$ for any $u$; hence any string of the form $u_1, G(u_1), \ldots, u_t, G(u_t)$ is a fixed point of $H^{\otimes t}$. Call one of these fixed points $w$, and let $w' := U(w)$.
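The two properties of $H$ used here (that $H^2$ is the identity, and that every string $(u, G(u))$ is a fixed point) can be checked by brute force on a small example. The sketch below assumes a hypothetical 3-bit gate, "add 1 modulo 8", chosen only because it is reversible and not its own inverse; it is not a gate discussed in the paper.

```python
from itertools import product

N = 3  # width of the hypothetical example gate


def G(u):
    """Hypothetical 3-bit reversible gate: add 1 modulo 8 (not self-inverse)."""
    value = (int("".join(map(str, u)), 2) + 1) % 8
    return tuple(int(ch) for ch in format(value, "03b"))


ALL = list(product((0, 1), repeat=N))
G_INV = {G(u): u for u in ALL}  # inverse truth table of G


def H(s):
    """Apply G to the first register and G^{-1} to the second, then swap the registers."""
    u, v = s[:N], s[N:]
    return G_INV[v] + G(u)


involution = all(H(H(u + v)) == u + v for u in ALL for v in ALL)
fixed_points = all(H(u + G(u)) == u + G(u) for u in ALL)
print(involution, fixed_points)
```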

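We now consider a circuit $R$ that applies $U^{-1}$, followed by $H^{\otimes t}$, followed by $U$. This $R$ satisfies the following identities:

$$R(y) = U\left(H^{\otimes t}\left(U^{-1}(y)\right)\right) = U\left(H^{\otimes t}(y)\right) = U(z) = z',$$

$$R(z') = U\left(H^{\otimes t}\left(U^{-1}(z')\right)\right) = U\left(H^{\otimes t}(z)\right) = U(y) = y,$$

$$R(w') = U\left(H^{\otimes t}\left(U^{-1}(w')\right)\right) = U\left(H^{\otimes t}(w)\right) = U(w) = w'.$$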

Using $R$, we now construct $\mathrm{CC}_k(x_1 \ldots x_k, c)$. Let $A$ and $B$ be two $n$-bit registers, initialized to $A := w'$ and $B := 0^{a-k} x_1 \ldots x_k 1^b$. Also, let $q\bar{q}$ be two ancilla bits in dual-rail representation, initialized to $q\bar{q} = 01$. Then to apply $\mathrm{CC}_k$, we do the following:

1. Swap $q$ with $\bar{q}$ if and only if $x_1 = \cdots = x_k$ and $c = 1$.

2. Swap $A$ with $B$ if and only if $q = 1$.

3. Apply $R$ to the $A$ register.

4. Swap $A$ with $B$ if and only if $q = 1$.

5. Swap $q$ with $\bar{q}$ if and only if $x_1 = \cdots = x_k$ and $c = 1$.

Here each conditional swap is implemented using Fredkin gates; recall from Theorem 25 that Fredkin generates every conservative transformation.

It is not hard to check that the above sequence maps $x_1 \ldots x_k = 0^k$ to $1^k$ and $x_1 \ldots x_k = 1^k$ to $0^k$ if $c = 1$; otherwise it maps the inputs to themselves. Furthermore, the ancilla bits are returned to their original states in all cases, since $w'$ is a fixed point of $R$. Therefore we have implemented $\mathrm{CC}_k$. ■
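The five steps can be simulated on a toy instance. In the sketch below, $R$ is modeled only by its action on the three strings the proof uses ($y$, $z'$, and $w'$); the values of $a$, $b$, $k$ and the stand-in fixed point `WP` are hypothetical choices for illustration, and the $B$ register is assumed to hold $0^{a-k} x_1 \ldots x_k 1^b$.

```python
from itertools import product

A_ZEROS, B_ONES, K = 2, 1, 2                        # hypothetical a, b, k
Y = (0,) * A_ZEROS + (1,) * B_ONES                  # y  = 0^a 1^b
ZP = (0,) * (A_ZEROS - K) + (1,) * (B_ONES + K)     # z' = 0^(a-k) 1^(b+k)
WP = (0, 1, 0)                                      # stand-in for the fixed point w'

# R restricted to the strings it ever sees here: it exchanges y and z' and fixes w'.
R = {Y: ZP, ZP: Y, WP: WP}


def condition(B, c):
    """True iff c = 1 and the k bits x_1..x_k stored inside B are all equal."""
    x = B[A_ZEROS - K:A_ZEROS]
    return c == 1 and len(set(x)) == 1


def apply_cc_k(x, c):
    A = WP
    B = (0,) * (A_ZEROS - K) + tuple(x) + (1,) * B_ONES
    q, qbar = 0, 1
    if condition(B, c):          # step 1
        q, qbar = qbar, q
    if q == 1:                   # step 2
        A, B = B, A
    A = R[A]                     # step 3
    if q == 1:                   # step 4
        A, B = B, A
    if condition(B, c):          # step 5
        q, qbar = qbar, q
    return B[A_ZEROS - K:A_ZEROS], A, (q, qbar)


for x in product((0, 1), repeat=K):
    for c in (0, 1):
        print(x, c, apply_cc_k(x, c))
```

In every case the $A$ register and the dual-rail pair return to their initial values, and the $x$ bits are complemented exactly when they are all equal and $c = 1$.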

Theorem 41 has the following corollary.

Corollary 42 Let $S$ be any non-conservative gate set. Then Fredkin + $S$ generates one of the following classes: $\langle \mathrm{Fredkin}, \mathrm{NOTNOT} \rangle$, $\langle \mathrm{Fredkin}, \mathrm{NOT} \rangle$, $\langle C_k \rangle$ for some $k \ge 3$, or $\langle \mathrm{Toffoli} \rangle$.

Proof. We know from Proposition 2 that $S$ generates a single gate $G$ such that $k(G) = k(S)$.

If $k(S) \ge 3$, then Theorem 41 implies that Fredkin + $G$ generates all mod-$k(S)$-preserving transformations, which equals $\langle C_{k(S)} \rangle$ by Corollary 29. If $k(S) = 2$ and $S$ is parity-preserving, then Theorem 41 implies that Fredkin + $G$ generates all parity-preserving transformations, which equals $\langle \mathrm{Fredkin}, \mathrm{NOTNOT} \rangle$ by Proposition 30. If $k(S) = 1$, then Theorem 41 implies that Fredkin + $G$ generates all transformations, which equals $\langle \mathrm{Toffoli} \rangle$ by Theorem 23.

By Theorem 12, the one remaining case is that $k(S) = 2$ and some $G \in S$ is parity-flipping. By Theorem 41, certainly Fredkin + $G$ at least generates all parity-preserving transformations. Furthermore, let $F$ be any parity-flipping transformation. Then $F \otimes G^{-1}$ is parity-preserving. So we can use Fredkin + $G$ to implement $F \otimes G^{-1}$, then compose with $G$ itself to get $F$. Therefore we generate all parity-flipping transformations, which equals $\langle \mathrm{Fredkin}, \mathrm{NOT} \rangle$ by Theorem 31. ■

8.2 Computing with Garbage 

For completeness, in this section we reprove some lemmas first shown by Seth Lloyd [19] in an unpublished 1992 technical report^17 and later rediscovered by Kerntopf et al. [13] and De Vos and Storme [29]. We will use these lemmas to show the power of non-affine gates.

Recall the notion of generating with garbage from Section 2.3.

Lemma 43 ([19], [29]) Every nontrivial reversible gate $G$ generates NOT with garbage.

Proof. Let $G(x_1 \ldots x_n) = y_1 \ldots y_n$ be nontrivial, and let $y_i = f_i(x_1 \ldots x_n)$. Then it suffices to show that at least one $f_i$ is a non-monotone Boolean function. For if $f_i$ is non-monotone, then by definition, there exist two inputs $x, x' \in \{0,1\}^n$, which are identical except that $x_j = 1$ and $x'_j = 0$ at some bit $j$, such that $f_i(x) = 0$ and $f_i(x') = 1$. But then, if we set the other $n - 1$ bits consistent with $x$ and $x'$, we have $y_i = \mathrm{NOT}(x_j)$.

Thus, suppose by contradiction that every $f_i$ is monotone. Then reversibility clearly implies that $G(0^n) = 0^n$, and that the set of strings of Hamming weight 1 is mapped to itself: that is, there exists a permutation $\sigma$ such that $G(e_j) = e_{\sigma(j)}$ for all $j$. Furthermore, by monotonicity, for all $j \ne k$ we have $G(e_j \oplus e_k) \ge e_{\sigma(j)} \oplus e_{\sigma(k)}$. But then reversibility implies that $G(e_j \oplus e_k)$ can only be $e_{\sigma(j)} \oplus e_{\sigma(k)}$ itself, and so on inductively, so that we obtain $G(x_1 \ldots x_n) = x_{\sigma^{-1}(1)} \ldots x_{\sigma^{-1}(n)}$ for all $x \in \{0,1\}^n$. But this means that $G$ is trivial, contradiction. ■
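The first half of this proof is constructive, and a brute-force search makes it concrete. The sketch below (the function name `find_not` and the choice of Toffoli as the example are illustrative assumptions) looks for an output bit $i$, an input bit $j$, and a setting $x$ of the inputs witnessing that $f_i$ is non-monotone.

```python
from itertools import product


def find_not(gate, n):
    """Search for (i, j, x) with x_j = 1, f_i(x) = 0 and f_i(x with x_j set to 0) = 1."""
    for x in product((0, 1), repeat=n):
        for j in range(n):
            if x[j] != 1:
                continue
            x_low = x[:j] + (0,) + x[j + 1:]
            y, y_low = gate(x), gate(x_low)
            for i in range(n):
                if y[i] == 0 and y_low[i] == 1:
                    return i, j, x
    return None  # every output function is monotone


def toffoli(x):
    a, b, c = x
    return (a, b, c ^ (a & b))


print(find_not(toffoli, 3))
print(find_not(lambda x: x, 3))  # the identity gate has only monotone outputs
```

Fixing all inputs other than bit $j$ to their values in the returned $x$ makes output bit $i$ equal to $\mathrm{NOT}(x_j)$, with the remaining outputs treated as garbage.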

Proposition 44 (folklore) For all $n \ge 3$, every non-affine Boolean function on $n$ bits has a non-affine subfunction on $n - 1$ bits.

Proof. Let $f : \{0,1\}^n \to \{0,1\}$ be non-affine, and let $f_0$ and $f_1$ be the $(n-1)$-bit subfunctions obtained by restricting $f$'s first input bit to 0 or 1 respectively. If either $f_0$ or $f_1$ is itself non-affine, then we are done. Otherwise, we have $f_0(x) = (a_0 \cdot x) \oplus b_0$ and $f_1(x) = (a_1 \cdot x) \oplus b_1$, for some $a_0, a_1 \in \{0,1\}^{n-1}$ and $b_0, b_1 \in \{0,1\}$. Notice that $f$ is non-affine if and only if $a_0 \ne a_1$. So there is some bit where $a_0$ and $a_1$ are unequal. If we now remove any of the other rightmost $n - 1$ input bits (which must exist since $n - 1 \ge 2$) from $f$, then we are left with a non-affine function on $n - 1$ bits. ■

Lemma 45 ([19], [29]) Every non-affine reversible gate $G$ generates the 2-bit AND gate with garbage.

^17 Prompted by the present work, Lloyd has recently posted his 1992 report to the arXiv.

Proof. Certainly every non-affine gate is nontrivial, so we know from Lemma 43 that $G$ generates NOT with garbage. For this reason, it suffices to show that $G$ can generate some non-affine 2-bit gate with garbage (since all such gates are equivalent to AND under negating inputs and outputs). Let $G(x_1 \ldots x_n) = y_1 \ldots y_n$, and let $y_i = f_i(x_1 \ldots x_n)$. Then some particular $f_i$ must be a non-affine Boolean function. So it suffices to show that, by restricting $n - 2$ of $f_i$'s input bits, we can get a non-affine function on 2 bits. But this follows by inductively applying Proposition 44. ■
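The inductive use of Proposition 44 can be sketched as a greedy search: repeatedly fix one input bit so that the restricted function stays non-affine, until only 2 inputs remain. The helper names and the example function below are hypothetical; affineness is tested via the identity $f(x \oplus y) \oplus f(x) \oplus f(y) \oplus f(0) = 0$, which holds for all $x, y$ exactly when $f$ is affine over $\mathbb{F}_2$.

```python
from itertools import product


def is_affine(f, n):
    """f is affine over F_2 iff f(x^y) ^ f(x) ^ f(y) ^ f(0) == 0 for all x, y."""
    zero = (0,) * n
    points = list(product((0, 1), repeat=n))
    return all(
        f(tuple(a ^ b for a, b in zip(x, y))) ^ f(x) ^ f(y) ^ f(zero) == 0
        for x in points
        for y in points
    )


def restrict(f, pos, bit):
    """Fix input position `pos` of f to `bit`, giving a function on one fewer input."""
    return lambda x: f(x[:pos] + (bit,) + x[pos:])


def shrink_to_two_bits(f, n):
    """Fix inputs one at a time, keeping the function non-affine, until n == 2.

    Returns the 2-bit function and the (position, bit) choices made; each position
    is relative to the inputs still free at that step.
    """
    choices = []
    while n > 2:
        for pos, bit in product(range(n), (0, 1)):
            g = restrict(f, pos, bit)
            if not is_affine(g, n - 1):
                f, n = g, n - 1
                choices.append((pos, bit))
                break
        else:
            raise ValueError("no non-affine restriction found")
    return f, choices


def example(x):
    """Hypothetical non-affine function on 4 bits."""
    return x[0] ^ (x[1] & x[2]) ^ x[3]


g2, choices = shrink_to_two_bits(example, 4)
print(choices, [g2(x) for x in product((0, 1), repeat=2)])
```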

By using Lemma 45, it is possible to prove directly that the only classes that contain a CNOT gate are $\langle \mathrm{CNOT} \rangle$ (i.e., all affine transformations) and $\langle \mathrm{Toffoli} \rangle$ (i.e., all transformations); in other words, that if $G$ is any non-affine gate, then $\langle \mathrm{CNOT}, G \rangle = \langle \mathrm{Toffoli} \rangle$. However, we will skip this result, since it is subsumed by our later results.

Recall that COPY is the 2-bit partial gate that maps 00 to 00 and 10 to 11. 

Lemma 46 ([19], [13]) Every non-degenerate reversible gate $G$ generates COPY with garbage.

Proof. Certainly every non-degenerate gate is nontrivial, so we know from Lemma 43 that $G$ generates NOT with garbage. So it suffices to show that there is some pair of inputs $x, x' \in \{0,1\}^n$, which differ only at a single coordinate $i$, such that $G(x)$ and $G(x')$ have Hamming distance at least 2. For then if we set $x_i := z$, and regard the remaining $n - 1$ coordinates of $x$ as ancillas, we will find at least two copies of $z$ or $\bar{z}$ in $G(x)$, which we can convert to at least two copies of $z$ using NOT gates. Also, if all of the ancilla bits that receive a copy of $z$ were initially 1, then we can use a NOT gate to reduce to the case where one of them was initially 0.

Thus, suppose by contradiction that $G(x)$ and $G(x')$ are neighbors on the Hamming cube whenever $x$ and $x'$ are neighbors. Then starting from $0^n$ and $G(0^n)$, we find that every $G(e_i)$ must be a neighbor of $G(0^n)$, every $G(e_i \oplus e_j)$ must be a neighbor of $G(e_i)$ and $G(e_j)$, and so on, so that $G$ is just a rotation and reflection of $\{0,1\}^n$. But that means $G$ is degenerate, contradiction. ■


8.3 Conservative Generates Fredkin

In this section, we prove the following theorem.

Theorem 47 Let $G$ be any nontrivial conservative gate. Then $G$ generates Fredkin.

The proof will be slightly more complicated than necessary, but we will then reuse parts of it in Section 8.4, when we show that every non-affine, non-conservative gate generates Fredkin.

Given a gate $Q$, let us call $Q$ strong quasi-Fredkin if there exist control strings $a, b, c, d$ such that

$$Q(a, 01) = (a, 01), \qquad (3)$$

$$Q(b, 01) = (b, 10), \qquad (4)$$

$$Q(c, 00) = (c, 00), \qquad (5)$$

$$Q(d, 11) = (d, 11). \qquad (6)$$


Lemma 48 Let $G$ be any nontrivial $n$-bit conservative gate. Then $G$ generates a strong quasi-Fredkin gate.

Proof. By conservativity, $G$ maps unit vectors to unit vectors, say $G(e_i) = e_{\pi(i)}$ for some permutation $\pi$. But since $G$ is nontrivial, there is some input $x \in \{0,1\}^n$ such that $x_i = 1$, but the corresponding bit $\pi(i)$ in $G(x)$ is 0. By conservativity, there must also be some bit $j$ such that $x_j = 0$, but bit $\pi(j)$ of $G(x)$ is 1. Now permute the inputs to make bit $j$ and bit $i$ the last two bits, permute the outputs to make bits $\pi(j)$ and $\pi(i)$ the last two bits, and permute either inputs or outputs to make $x$ match $G(x)$ on the first $n - 2$ bits. After these permutations are performed, $x$ has the form $w01$ for some $w \in \{0,1\}^{n-2}$. So

$$G(0^{n-2}, 01) = (0^{n-2}, 01),$$

$$G(w, 01) = (w, 10),$$

$$G(0^{n-2}, 00) = (0^{n-2}, 00),$$

$$G(1^{n-2}, 11) = (1^{n-2}, 11),$$

where the last two lines again follow from conservativity. Hence $G$ (after these permutations) satisfies the definition of a strong quasi-Fredkin gate. ■

Next, call a gate $C$ a catalyzer if, for every $x \in \{0,1\}^{2n}$ with Hamming weight $n$, there exists a “program string” $p(x)$ such that

$$C\left(p(x), 0^n 1^n\right) = \left(p(x), x\right).$$

In other words, a catalyzer can be used to transform $0^n 1^n$ into any target string $x$ of Hamming weight $n$. Here $x$ can be encoded in any manner of our choice into the auxiliary program string $p(x)$, as long as $p(x)$ is left unchanged by the transformation. The catalyzer itself cannot depend on $x$.

Lemma 49 Let $Q$ be a strong quasi-Fredkin gate. Then $Q$ generates a catalyzer.

Proof. Let $z := 0^n 1^n$ be the string that we wish to transform. For all $i \in \{1, \ldots, n\}$ and $j \in \{n+1, \ldots, 2n\}$, let $S_{ij}$ denote the operation that swaps the $i$-th and $j$-th bits of $z$. Then consider the following list of “candidate swaps”:

$$S_{1,n+1}, \ldots, S_{1,2n}, \; S_{2,n+1}, \ldots, S_{2,2n}, \; \ldots, \; S_{n,n+1}, \ldots, S_{n,2n}.$$

Suppose we go through the list in order from left to right, and for each swap in the list, get to choose whether to apply it or not. It is not hard to see that, by making these choices, we can map $0^n 1^n$ to any $x$ such that $|x| = n$, by pairing off the first 0 bit that should be 1 with the first 1 bit that should be 0, the second 0 bit that should be 1 with the second 1 bit that should be 0, and so on, and choosing to swap those pairs of bits and not any other pairs.

Now, let the program string $p(x)$ be divided into registers $r_1, \ldots, r_{n^2}$, each of the same size. Suppose that, rather than applying (or not applying) the $t$-th swap $S_{ij}$ in the list, we instead apply the gate $Q$, with $r_t$ as the control string, and $z_i$ and $z_j$ as the target bits. Then we claim that we can map $z$ to $x$ as well. If the $t$-th candidate swap is supposed to occur, then we set $r_t := b$. If the $t$-th candidate swap is not supposed to occur, then we set $r_t$ to either $a$, $c$, or $d$, depending on whether $z_i z_j$ equals 01, 00, or 11 at step $t$ of the swapping process. Note that, because we know $x$ when designing $p(x)$, we know exactly what $z_i z_j$ is going to be at each time step. Also, $z_i z_j$ will never equal 10, because of the order in which we perform the swaps: we swap each 0 bit $z_i$ that needs to be swapped with the first 1 bit $z_j$ that we can. After we have performed the swap, $z_i = 1$ will then only be compared against other 1 bits, never against 0 bits. ■
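The swapping schedule in this proof can be checked exhaustively for small $n$. The sketch below (function name hypothetical) walks the candidate swaps $S_{ij}$ in the stated order, applies a swap exactly when $z_i$ is a 0 that should become 1 and $z_j$ is a 1 that should become 0, and records every pair $z_i z_j$ encountered so that the claim "never 10" can be inspected.

```python
from itertools import product


def run_candidate_swaps(x):
    """Map z = 0^n 1^n to the weight-n target x using the ordered candidate swaps."""
    n = len(x) // 2
    z = [0] * n + [1] * n
    seen_pairs = set()
    for i in range(n):                  # position in the first half
        for j in range(n, 2 * n):       # position in the second half
            seen_pairs.add((z[i], z[j]))  # value of z_i z_j just before candidate S_ij
            if z[i] == 0 and x[i] == 1 and z[j] == 1 and x[j] == 0:
                z[i], z[j] = z[j], z[i]
    return tuple(z), seen_pairs


n = 3
targets = [x for x in product((0, 1), repeat=2 * n) if sum(x) == n]
results = [run_candidate_swaps(x) for x in targets]
print(all(z == x for (z, _), x in zip(results, targets)))
print(any((1, 0) in seen for _, seen in results))
```

Replacing each candidate swap by the gate $Q$ with control register set to $b$ (swap) or to $a$, $c$, $d$ (no swap, according to the observed pair) then gives the catalyzer described above.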

Finally:

Lemma 50 Let $G$ be any non-affine gate, and let $C$ be any catalyzer. Then $G + C$ generates Fredkin.

Proof. We will actually show how to generate any conservative transformation $F : \{0,1\}^n \to \{0,1\}^n$.

Since $G$ is non-affine, Lemmas 43, 45, and 46 together imply that we can use $G$ to compute any Boolean function, albeit possibly with input-dependent garbage.

Let $x \in \{0,1\}^n$. Then by assumption, $C$ maps $0^n 1^n$ to $F(x)\overline{F(x)}$ using the program string $p\left(F(x)\overline{F(x)}\right)$. Now, starting with $x$ and ancillas $0^n 1^n$, we can clearly use $G$ to produce

$$x, \; \mathrm{gar}(x), \; p\left(F(x)\overline{F(x)}\right),$$

for some garbage $\mathrm{gar}(x)$. We can then apply $C$ to get

$$x, \; \mathrm{gar}(x), \; p\left(F(x)\overline{F(x)}\right), \; F(x), \; \overline{F(x)}.$$

Uncomputing $p\left(F(x)\overline{F(x)}\right)$ yields

$$x, \; F(x), \; \overline{F(x)}.$$

Notice that since $F$ is conservative, we have $\left|x, \overline{F(x)}\right| = n$. Therefore, there exists some program string $p\left(x, \overline{F(x)}\right)$ that can be used as input to $C^{-1}$ to map $x, \overline{F(x)}$ to $0^n 1^n$. Again, we can generate this program string using the fact that $G$ is non-affine:

$$x, \; \overline{F(x)}, \; F(x), \; \mathrm{gar}(F(x)), \; p\left(x, \overline{F(x)}\right).$$

Applying $C^{-1}$ and then uncomputing, we get

$$F(x), \; 0^n 1^n,$$

which completes the proof. ■

By Lemma 18, every nontrivial conservative gate is also non-affine. Therefore, combining Lemmas 48, 49, and 50 completes the proof of Theorem 47, that every nontrivial conservative gate generates Fredkin.

8.4 Non-Conservative Generates Fredkin 

Building on our work in Section 8.3, in this section we handle the non-conservative case, proving the following theorem.

Theorem 51 Every non-affine, non-conservative gate generates Fredkin.

Thus, let $G$ be a non-affine, non-conservative gate. Starting from $G$, we will perform a sequence of transformations to produce gates that are “gradually closer” to Fredkin. Some of these transformations might look a bit mysterious, but they will culminate in a strong quasi-Fredkin gate, which we already know from Lemmas 49 and 50 is enough to generate a Fredkin gate (since $G$ is also non-affine).

The first step is to create a non-affine gate with two particular inputs as fixed points.

Lemma 52 Let $G$ be any non-affine gate on $n$ bits. Then $G$ generates a non-affine gate $H$ on $n^2$ bits that acts as the identity on the inputs $0^{n^2}$ and $1^{n^2}$.

Proof. We construct $H$ as follows:

1. Apply $G^{\otimes n}$ to $n^2$ input bits. Let $G_i$ be the $i$-th gate in this tensor product.

2. For all $i \in [n-1]$, swap the $i$-th output bit of $G_i$ with the $i$-th output bit of $G_n$.

3. Apply $(G^{-1})^{\otimes n}$.

It is easy to see that $H$ maps $0^{n^2}$ to $0^{n^2}$ and $1^{n^2}$ to $1^{n^2}$. (Indeed, $H$ maps every input that consists of an $n$-bit string repeated $n$ times to itself.) To see that $H$ is also non-affine, first notice that $G^{-1}$ is non-affine. But we can cause any input $x = x_1 \ldots x_n$ that we like to be fed into the final copy of $G^{-1}$, by encoding that input “diagonally,” with each $G_i$ producing $x_i$ as its $i$-th output bit. Therefore $H$ is non-affine. ■
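The fixed-point property of this construction is easy to confirm by brute force. The sketch below instantiates the three steps with $G$ = Toffoli ($n = 3$, so $H$ acts on 9 bits); the choice of Toffoli and the reading of step 2 as exchanging the $i$-th output bits of $G_i$ and $G_n$ are assumptions made for illustration.

```python
from itertools import product

N = 3


def toffoli(x):
    a, b, c = x
    return (a, b, c ^ (a & b))


ALL = list(product((0, 1), repeat=N))
G = toffoli
G_INV = {G(u): u for u in ALL}  # inverse truth table


def H(s):
    """The three steps of the construction on an N*N-bit input s (a tuple of bits)."""
    # Step 1: apply G to each of the N blocks.
    blocks = [list(G(s[N * i:N * (i + 1)])) for i in range(N)]
    # Step 2: for i < N, swap the i-th output of block i with the i-th output of block N.
    for i in range(N - 1):
        blocks[i][i], blocks[N - 1][i] = blocks[N - 1][i], blocks[i][i]
    # Step 3: apply G^{-1} to each block.
    return tuple(bit for blk in blocks for bit in G_INV[tuple(blk)])


repeated_fixed = all(H(u * N) == u * N for u in ALL)
print(repeated_fixed)
```

Every input of the form $u\,u\,u$ is returned unchanged, which in particular covers $0^{n^2}$ and $1^{n^2}$.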

As a remark, with all the later transformations we perform, we will want to maintain the property that the all-0 and all-1 inputs are fixed points. Fortunately, this will not be hard to arrange.

Let $H$ be the output of Lemma 52. If $H$ is conservative (i.e., $k(H) = \infty$), then $H$ already generates Fredkin by Theorem 47, so we are done. Thus, we will assume in what follows that $k(H)$ is finite. We will further assume that $H$ is mod-$k(H)$-preserving. By Theorem 12, the only gates $H$ that are not mod-$k(H)$-preserving are the parity-flipping gates; but if $H$ is parity-flipping, then $H \otimes H$ is parity-preserving, and we can simply repeat the whole construction with $H \otimes H$ in place of $H$.

Now we want to show that we can use $H$ to decrease the inner product between a pair of inputs by exactly 1 mod $m$, for any $m$ we like.

Lemma 53 Let $H$ be any non-conservative, nonlinear gate. Then for all $m \ge 2$, there is a positive integer $t$, and inputs $x, y$, such that

$$H^{\otimes t}(x) \cdot H^{\otimes t}(y) - x \cdot y \equiv -1 \pmod{m}.$$

Proof. Let $m = p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_s^{\alpha_s}$, where each $p_i$ is a distinct prime. By Corollary [T^ we know that for each $p_i$, there is some pair of inputs $x_i, y_i$ such that

$$H(x_i) \cdot H(y_i) \not\equiv x_i \cdot y_i \pmod{p_i}.$$

In other words, letting

$$\gamma_i := H(x_i) \cdot H(y_i) - x_i \cdot y_i,$$

we have $\gamma_i \not\equiv 0 \pmod{p_i}$ for all $i \in \{1, \ldots, s\}$. Our goal is to find an $(x, y)$ such that

$$H^{\otimes t}(x) \cdot H^{\otimes t}(y) - x \cdot y \equiv -1 \pmod{m}.$$

To do so, it suffices to find nonnegative integers $d_1, \ldots, d_s$ that solve the equation

$$\sum_{i=1}^{s} d_i \gamma_i \equiv -1 \pmod{m}. \qquad (7)$$

Here $d_i$ represents the number of times the pair $(x_i, y_i)$ occurs in $(x, y)$. By construction, no $p_i$ divides $\gamma_i$, and since the $p_i$'s are distinct primes, they have no common factor. This implies that $\gcd(\gamma_1, \ldots, \gamma_s, m) = 1$. So by the Chinese Remainder Theorem, a solution to (7) exists. ■
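For small parameters, a solution to equation (7) can simply be found by exhaustive search over $d_i \in \{0, \ldots, m-1\}$. The sketch below is a brute-force illustration, not the argument used in the proof; the function name and the sample values $m = 12$, $\gamma = (3, 4)$ are hypothetical (3 is not divisible by the prime 2, and 4 is not divisible by the prime 3).

```python
from itertools import product


def solve_slopes(gammas, m):
    """Find nonnegative d_i < m with sum(d_i * gamma_i) congruent to -1 mod m, if any."""
    for d in product(range(m), repeat=len(gammas)):
        if sum(di * gi for di, gi in zip(d, gammas)) % m == (-1) % m:
            return d
    return None


d = solve_slopes((3, 4), 12)
print(d, sum(di * gi for di, gi in zip(d, (3, 4))) % 12)
print(solve_slopes((2,), 4))  # gcd(2, 4) = 2, so no solution exists
```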

Note also that, if $H$ maps the all-0 and all-1 strings to themselves, then $H^{\otimes t}$ does so as well.

To proceed further, it will be helpful to introduce some terminology. Suppose that we have two strings $x = x_1 \ldots x_n$ and $y = y_1 \ldots y_n$. For each $i$, the pair $x_i y_i$ has one of four possible values: 00, 01, 10, or 11. Let the type of $(x, y)$ be an ordered triple $(a, b, c) \in \mathbb{N}^3$, which simply records the number of occurrences in $(x, y)$ of each of the three pairs 01, 10, and 11. (It will be convenient not to keep track of 00 pairs, since they don't contribute to the Hamming weight of either $x$ or $y$.) Clearly, by applying swaps, we can convert between any pairs $(x, y)$ and $(x', y')$ of the same type, provided that $x, y, x', y'$ all have the same length $n$.

Now suppose that, by repeatedly applying a gate $H$, we can convert some input pair $(x, y)$ of type $(a, b, c)$ into some pair $(x', y')$ of type $(a', b', c')$. Then we say that $H$ generates the slope

$$(a' - a, \; b' - b, \; c' - c).$$

Note that, if $H$ generates the slope $(p, q, r)$, then by inverting the transformation, we can also generate the slope $(-p, -q, -r)$. Also, if $H$ generates the slope $(p, q, r)$ by acting on the input pair $(x, y)$, and the slope $(p', q', r')$ by acting on $(x', y')$, then it generates the slope $(p + p', q + q', r + r')$ by acting on $(xx', yy')$. For these reasons, the achievable slopes form a 3-dimensional lattice (that is, a subset of $\mathbb{Z}^3$ closed under integer linear combinations), which we can denote $\mathcal{L}(H)$.

What we really want is for the lattice $\mathcal{L}(H)$ to contain a particular point: $(1, 1, -1)$. Once we have shown this, we will be well on our way to generating a strong quasi-Fredkin gate. We first need a general fact about slopes.

Lemma 54 Let $H$ map the all-0 input to itself. Then $\mathcal{L}(H)$ contains the points $(k(H), 0, 0)$, $(0, k(H), 0)$, and $(0, 0, k(H))$.

Proof. Recall from Proposition 40 that there exists a $t$, and an input $w$, such that $\left|H^{\otimes t}(w)\right| = |w| + k(H)$. Thus, to generate the slope $(k(H), 0, 0)$, we simply need to do the following:

• Choose an input pair $(x, y)$ with sufficiently many $x_i y_i$ pairs of the forms 10 and 00.

• Apply $H^{\otimes t}$ to a subset of bits on which $x$ equals $w$, and $y$ equals the all-0 string.

Doing this will increase the number of 10 pairs by $k(H)$, while not affecting the number of 01 or 11 pairs.

To generate the slope $(0, k(H), 0)$, we do exactly the same thing, except that we reverse the roles of $x$ and $y$.

Finally, to generate the slope $(0, 0, k(H))$, we choose an input pair $(x, y)$ with sufficiently many $x_i y_i$ pairs of the forms 11 and 00, and then use the same procedure to increase the number of 11 pairs by $k(H)$. ■

We can now prove that (1,1, —1) is indeed in our lattice. 

Lemma 55 Let $H$ be a mod-$k(H)$-preserving gate that maps the all-0 input to itself, and suppose there exist inputs $x, y$ such that

$$H(x) \cdot H(y) - x \cdot y \equiv -1 \pmod{k(H)}.$$

Then $(1, 1, -1) \in \mathcal{L}(H)$.

Proof. The assumption implies directly that $H$ generates a slope of the form $(p, q, -1 + r\,k(H))$, for some integers $p, q, r$. Thus, Lemma 54 implies that $H$ also generates a slope of the form $(p, q, -1)$, via some gate $G \in \langle H \rangle$ acting on inputs $(x, y)$. Now, since $H$ is mod-$k(H)$-preserving, we have $|G(x)| \equiv |x| \pmod{k(H)}$ and $|G(y)| \equiv |y| \pmod{k(H)}$. But this implies that $p \equiv 1 \pmod{k(H)}$ and $q \equiv 1 \pmod{k(H)}$. So, again using Lemma 54, we can generate the slope $(1, 1, -1)$. ■

Combining Lemmas 52, 53, and 55, we can summarize our progress so far as follows.

Corollary 56 Let $G$ be any non-affine, non-conservative gate. Then either $G$ generates Fredkin, or else it generates a gate $H$ that maps the all-0 and all-1 inputs to themselves, and that also satisfies $(1, 1, -1) \in \mathcal{L}(H)$.

We now explain the importance of the lattice point $(1, 1, -1)$. Given a gate $Q$, let us call $Q$ weak quasi-Fredkin if there exist strings $a$ and $b$ such that

$$Q(a, 01) = (a, 01),$$

$$Q(b, 01) = (b, 10).$$

Then:

Lemma 57 A gate $H$ generates a weak quasi-Fredkin gate if and only if $(1, 1, -1) \in \mathcal{L}(H)$.

Proof. If $H$ generates a weak quasi-Fredkin gate $Q$, then applying $Q$ to the input pair $(a, 01)$ and $(b, 01)$ directly generates the slope $(1, 1, -1)$. For the converse direction, if $H$ generates the slope $(1, 1, -1)$, then by definition there exists a gate $Q \in \langle H \rangle$, and inputs $x, y$, such that $|Q(x)| = |x|$ and $|Q(y)| = |y|$, while

$$Q(x) \cdot Q(y) = x \cdot y - 1.$$

In other words, applying $Q$ decreases by one the number of 1 bits on which $x$ and $y$ agree, while leaving their Hamming weights the same. But in that case, by permuting input and output bits, we can easily put $Q$ into the form of a weak quasi-Fredkin gate. ■
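The forward direction of this lemma can be checked numerically on the Fredkin gate itself, which is weak quasi-Fredkin with control strings $a = 0$ and $b = 1$. The sketch below computes types as counts of the pairs 01, 10, 11 (in that order) and the resulting slope; the helper names are illustrative.

```python
def pair_type(x, y):
    """Counts of positions where (x_i, y_i) equals 01, 10, 11, in that order."""
    pairs = list(zip(x, y))
    return (pairs.count((0, 1)), pairs.count((1, 0)), pairs.count((1, 1)))


def slope(gate, x, y):
    """Type of (gate(x), gate(y)) minus type of (x, y), coordinate by coordinate."""
    before, after = pair_type(x, y), pair_type(gate(x), gate(y))
    return tuple(new - old for new, old in zip(after, before))


def fredkin(s):
    c, a, b = s
    return (c, b, a) if c else (c, a, b)


# x = (a, 01) with control a = 0, and y = (b, 01) with control b = 1.
print(slope(fredkin, (0, 0, 1), (1, 0, 1)))
```

The single 11 pair in the two target bits is replaced by one 01 pair and one 10 pair, which is exactly the slope $(1, 1, -1)$.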

Next, recall the definition of a strong quasi-Fredkin gate from Section 8.3. Then combining Corollary 56 with Lemma 57, we have the following.

Corollary 58 Let $G$ be any non-affine, non-conservative gate. Then either $G$ generates Fredkin, or else it generates a strong quasi-Fredkin gate.

Proof. Combining Corollary 56 with Lemma 57, we find that either $G$ generates Fredkin, or else it generates a weak quasi-Fredkin gate that maps the all-0 and all-1 strings to themselves. But such a gate is a strong quasi-Fredkin gate, since we can let $c$ be the all-0 string and $d$ be the all-1 string. ■

Combining Corollary 58 with Lemmas 49 and 50 now completes the proof of Theorem 51, that every non-affine, non-conservative gate generates Fredkin. However, since every non-affine, conservative gate generates Fredkin by Theorem 47, we get the following even broader corollary.

Corollary 59 Every non-affine gate generates Fredkin.

Finally, combined with Corollary 42, Corollary 59 completes the proof of Theorem 39, that every non-affine gate set generates either $\langle \mathrm{Fredkin} \rangle$, $\langle \mathrm{Fredkin}, \mathrm{NOTNOT} \rangle$, $\langle \mathrm{Fredkin}, \mathrm{NOT} \rangle$, $\langle C_k \rangle$ for some $k \ge 3$, or $\langle \mathrm{Toffoli} \rangle$.

9 The Affine Part

Having completed the classification of the non-affine classes, in this section we turn our attention to proving that there are no affine classes besides the ones listed in Theorem 3: namely, the trivial, $T_6$, $T_4$, $F_4$, CNOTNOT, and CNOT classes, as well as various extensions of them by NOTNOT and NOT gates.

To make the problem manageable, we start by restricting attention to the linear parts of affine transformations (i.e., if a transformation has the form $G(x) = Ax \oplus b$, we ignore the additive constant $b$). We show that the only possibilities for the linear part are: the identity, all mod-4-preserving orthogonal transformations, all orthogonal transformations, all parity-preserving linear transformations, or all linear transformations. This result, in turn, is broken into several pieces:

• In Section 9.1, we show that any mod-4-preserving orthogonal gate generates all mod-4-preserving orthogonal transformations, and that any non-mod-4-preserving orthogonal gate generates all orthogonal transformations.

• In Section 9.2, we show that every non-orthogonal, parity-preserving linear gate generates CNOTNOT. This again requires “slope theory” and the analysis of a 3-dimensional lattice. It also draws on the results of Section 6.3, which tell us that it suffices to restrict attention to the case $k(G) = 2$.

• In Section 9.3, we show that every non-parity-preserving linear gate generates CNOT. In this case we are lucky that we only need to analyze a 1-dimensional lattice (i.e., an ideal in $\mathbb{Z}$).

Finally, in Section 9.4, we complete the classification by showing that including the affine parts can yield only the following additional possibilities: NOTNOT, NOT, $F_4$, $F_4$ + NOTNOT, $F_4$ + NOT, $T_6$ + NOTNOT, $T_6$ + NOT, or CNOTNOT + NOT. Summarizing, the results of this section will imply the following.

Theorem 60 Any set of affine gates generates one of the following 13 classes: $\langle \emptyset \rangle$, $\langle \mathrm{NOTNOT} \rangle$, $\langle \mathrm{NOT} \rangle$, $\langle T_6 \rangle$, $\langle T_6, \mathrm{NOTNOT} \rangle$, $\langle T_6, \mathrm{NOT} \rangle$, $\langle T_4 \rangle$, $\langle F_4 \rangle$, $\langle T_4, \mathrm{NOTNOT} \rangle$, $\langle T_4, \mathrm{NOT} \rangle$, $\langle \mathrm{CNOTNOT} \rangle$, $\langle \mathrm{CNOTNOT}, \mathrm{NOT} \rangle$, or $\langle \mathrm{CNOT} \rangle$.

Together with Theorem 39, this will then complete the proof of Theorem 3.

9.1 The T and F Swamplands 

In this section, we wish to characterize the orthogonal classes. We first need a lemma. 

Lemma 61 $T_{4k+2}$ generates $T_6$, and $T_{4k}$ generates $T_4$, for all $k \ge 1$.

Proof. We first describe how to simulate $T_6(x_1 \ldots x_6)$, using three applications of $T_{4k+2}$. Let $b_x := x_1 \oplus \cdots \oplus x_6$. Also, let $a$ be a string of ancilla bits, initialized to $0^{2k-2}$. Then:

1. Apply $T_{4k+2}$ to the string $0^{4k-4} x_1 \ldots x_6$. This yields $b_x^{4k-4}, x_1 \oplus b_x, \ldots, x_6 \oplus b_x$.

2. Swap out $2k - 2$ of the $b_x$ bits with the ancilla string $a = 0^{2k-2}$, and apply $T_{4k+2}$ again. This yields

$$T_{4k+2}\left(0^{2k-2}, b_x^{2k-2}, x_1 \oplus b_x, \ldots, x_6 \oplus b_x\right) = \left(b_x^{2k-2}, 0^{2k-2}, x_1 \ldots x_6\right),$$

since the number of '$b_x$' entries is even.

3. Swap the $2k - 2$ bits that are now 0 with $a = b_x^{2k-2}$, and apply $T_{4k+2}$ a third time. This returns $a$ to $0^{2k-2}$ and yields

$$T_{4k+2}\left(b_x^{4k-4}, x_1 \ldots x_6\right) = \left(0^{4k-4}, x_1 \oplus b_x, \ldots, x_6 \oplus b_x\right).$$

Thus, we have successfully applied $T_6$ to $x_1 \ldots x_6$. The same sequence of steps can be used to simulate $T_4(x_1 \ldots x_4)$ using three applications of $T_{4k}$. ■
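The three-step simulation can be verified exhaustively for small $k$. The sketch below assumes that $T_m$ (for even $m$) XORs every bit with the parity of the whole string, and that the ancilla $a$ has $2k - 2$ bits; the function names are illustrative.

```python
from itertools import product


def T(bits):
    """T_m on an even number m of bits: XOR every bit with the parity of the string."""
    b = sum(bits) % 2
    return [v ^ b for v in bits]


def simulate_T6(x, k):
    """Three applications of T_{4k+2} on the register 0^{4k-4} x, with ancilla a."""
    h = 2 * k - 2
    a = [0] * h
    reg = T([0] * (2 * h) + list(x))       # step 1: register becomes b^{4k-4}, x + b
    reg[:h], a = a, reg[:h]                # step 2: swap out h of the b_x bits
    reg = T(reg)                           #         register becomes b^h, 0^h, x
    reg[h:2 * h], a = a, reg[h:2 * h]      # step 3: swap the h zero bits with a
    reg = T(reg)                           #         register becomes 0^{4k-4}, x + b
    return reg, a


ok = all(
    simulate_T6(x, k) == ([0] * (4 * k - 4) + T(list(x)), [0] * (2 * k - 2))
    for k in (1, 2, 3)
    for x in product((0, 1), repeat=6)
)
print(ok)
```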

We can now show that there is only one nontrivial orthogonal class that is also mod-4-preserving: namely, $\langle T_6 \rangle$.

Theorem 62 Any nontrivial mod-4-preserving linear gate $G$ generates $T_6$.

Proof. Let $G(x) = Ax$, for some $A \in \mathbb{F}_2^{n \times n}$. Then recall from Corollary 21 that $A$ is orthogonal. By Lemma 14, this implies that $A^{-1} = A^T$, so $G$ can also generate $A^T$.

Let $B$ be the $(n+1) \times (n+1)$ matrix that acts as the identity on the first bit, and as $A$ on bits 2 through $n + 1$. Observe that $B^T$ acts as the identity on the first bit, and as $A^T$ on bits 2 through $n + 1$. Also, since $A$ preserves Hamming weight mod 4, so do $A^T$, $B$, and $B^T$. By Corollary 22, this implies that each of $B^T$'s column vectors must have Hamming weight 1 mod 4. Furthermore, since $A$ is nontrivial, there must be some column of $B^T$ with Hamming weight $4k + 1$, for some $k \ge 1$. Then by swapping rows and columns, we can get $B^T$ into the form

$$
\begin{pmatrix}
1 & 0 & 0 \cdots 0 \\
0 & 1 & - \; v_1 \; - \\
\vdots & \vdots & \vdots \\
0 & 1 & - \; v_{4k+1} \; - \\
0 & 0 & - \; v_{4k+2} \; - \\
\vdots & \vdots & \vdots \\
0 & 0 & - \; v_n \; -
\end{pmatrix},
$$

where $v_1, \ldots, v_n$ are row vectors each of length $n - 1$. Let $\delta_{ij}$ equal 1 if $i = j$ or 0 otherwise. Then note that

$$
v_i \cdot v_j \equiv
\begin{cases}
\delta_{ij} + 1 & \text{if } i, j \le 4k + 1, \\
\delta_{ij} & \text{otherwise}
\end{cases}
\pmod{2}
$$

by orthogonality.

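Now let $C^T$ be the matrix obtained by swapping the first two columns of $B^T$. Then we claim that $C^T B$ yields a $T_{4k+2}$ transformation. Since $T_{4k+2}$ generates $T_6$ by Lemma 61, we will be done after we have shown this.

We have

$$
C^T B =
\begin{pmatrix}
0 & 1 & 0 \cdots 0 \\
1 & 0 & - \; v_1 \; - \\
\vdots & \vdots & \vdots \\
1 & 0 & - \; v_{4k+1} \; - \\
0 & 0 & - \; v_{4k+2} \; - \\
\vdots & \vdots & \vdots \\
0 & 0 & - \; v_n \; -
\end{pmatrix}
\begin{pmatrix}
1 & 0 & \cdots & 0 & 0 & \cdots & 0 \\
0 & 1 & \cdots & 1 & 0 & \cdots & 0 \\
0 & v_1^T & \cdots & v_{4k+1}^T & v_{4k+2}^T & \cdots & v_n^T
\end{pmatrix}
=
\begin{pmatrix}
0 & 1 & \cdots & 1 & 0 & \cdots & 0 \\
1 & 0 & \cdots & 1 & 0 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots & \vdots & & \vdots \\
1 & 1 & \cdots & 0 & 0 & \cdots & 0 \\
0 & 0 & \cdots & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 0 & 0 & \cdots & 1
\end{pmatrix}.
$$

One can check that the above transformation is actually $T_{4k+2}$ on the first $4k + 2$ bits, and the identity on the rest. ■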

Likewise, there is only one orthogonal class that is not mod-4-preserving: namely, $\langle T_4 \rangle$.

Theorem 63 Let $G$ be any nontrivial orthogonal gate that does not preserve Hamming weight mod 4. Then $G$ generates $T_4$.

Proof. We use essentially the same construction as in Theorem 62. The only change is that Corollary 22 now tells us that there must be a column of $B^T$ with Hamming weight $4k + 3$ for some $k \ge 0$, so we use that in place of the column with Hamming weight $4k + 1$. This leads to an $(n+1) \times (n+1)$ matrix $C^T B$, which acts as $T_{4k+4}$ on the first $4k + 4$ bits and as the identity on the rest. But $T_{4k+4}$ generates $T_4$ by Lemma 61, so we are done. ■


9.2 Non-Orthogonal Linear Generates CNOTNOT 

In classifying all linear gate sets, our next goal is to show that “there is nothing between orthogonal 
and parity-preserving.” In other words: 

Theorem 64 Let $G$ be any non-orthogonal, parity-preserving linear gate. Then $G$ generates CNOTNOT (or equivalently, all parity-preserving linear transformations).

The main idea of the proof is as follows. Let CPD, or Copying with a Parity Dumpster, be the following partial reversible gate:

$$\mathrm{CPD}(000) = 000,$$
$$\mathrm{CPD}(001) = 001,$$
$$\mathrm{CPD}(100) = 111,$$
$$\mathrm{CPD}(101) = 110.$$

In other words, CPD maps $x0y$ to $x, x, x \oplus y$: it copies $x$, but also XORs $x$ into the $y$ “dumpster” in order to preserve the total parity. Notice that CPD is consistent with CNOTNOT; indeed, it is simply the restriction of CNOTNOT to inputs whose second bit is 0. Notice also that, whenever we have a 3-bit string of the form $xxy$, we can apply CPD in reverse to get $x, 0, x \oplus y$.
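Both observations about CPD can be checked directly against CNOTNOT, assuming the convention that CNOTNOT XORs its first bit into the other two. The dictionary `CPD` below just transcribes the four defined input/output pairs.

```python
from itertools import product


def cnotnot(x, y, z):
    """CNOTNOT with the first bit as control: XOR it into the other two bits."""
    return (x, y ^ x, z ^ x)


# The partial gate CPD, defined only on inputs whose second bit is 0.
CPD = {
    (0, 0, 0): (0, 0, 0),
    (0, 0, 1): (0, 0, 1),
    (1, 0, 0): (1, 1, 1),
    (1, 0, 1): (1, 1, 0),
}

consistent = all(cnotnot(*s) == t for s, t in CPD.items())
# CNOTNOT is its own inverse, so "CPD in reverse" on x x y is again CNOTNOT.
reverse_ok = all(cnotnot(x, x, y) == (x, 0, x ^ y) for x, y in product((0, 1), repeat=2))
print(consistent, reverse_ok)
```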

Then we will first observe that CPD generates CNOTNOT. We will then apply the theory of types and slopes, which already made an appearance in Section 8.4, to show that any non-orthogonal linear gate generates CPD: in essence, that there are no modularity or other obstructions to generating it.

Lemma 65 Let $G$ be any gate that generates CPD. Then $G$ generates CNOTNOT (or equivalently, all parity-preserving linear transformations).

Proof. Let $F : \{0,1\}^n \to \{0,1\}^n$ be any reversible, parity-preserving linear transformation. Then we can generate the following sequence of states:

$$
\begin{aligned}
x &\to x, \; \mathrm{gar}(x), \; F(x) \\
&\to x, \; \mathrm{gar}(x), \; F(x), \; F(x), \; |x| \ (\mathrm{mod}\ 2) \\
&\to x, \; F(x), \; |x| \ (\mathrm{mod}\ 2) \\
&\to x, \; F(x), \; \mathrm{gar}(F(x)), \; x, \; |x| \ (\mathrm{mod}\ 2) \\
&\to F(x), \; \mathrm{gar}(F(x)), \; x, \; |x| + |F(x)| \ (\mathrm{mod}\ 2) \\
&= F(x), \; \mathrm{gar}(F(x)), \; x, \; 0 \ (\mathrm{mod}\ 2) \\
&\to F(x),
\end{aligned}
$$

for some garbage strings $\mathrm{gar}(x)$ and $\mathrm{gar}(F(x))$. Here the first line computes $F(x)$ from $x$; the second line applies CPD to copy $F(x)$ (using a single “dumpster” bit for each bit of $F(x)$); the third line uncomputes $F(x)$; the fourth line computes a second copy of $x$ from $F(x)$; the fifth line applies CPD in reverse to erase one of the copies of $x$ (reusing the same dumpster bit from before); and the sixth line uncomputes $x$. Also, $|x| + |F(x)| \equiv 0 \pmod{2}$ follows because $F$ is parity-preserving. ■

So, given a non-orthogonal, parity-preserving linear gate $G$, we now need to show how to implement CPD.

For the rest of this section, we will consider a situation where we are given an $n$-bit string, with the initial state $xy0^{n-2}$ (where $x$ and $y$ are two arbitrary bits), and then we apply a sequence of $\mathbb{F}_2$ linear transformations to the string. Here we do not assume that ancilla bits initialized to 1 are available, though ancilla bits initialized to 0 are fine. As a result, at every time step, every bit in our string will be either $x$, $y$, $x \oplus y$, or 0. Because we are studying only the linear case here, not the affine case, we do not need to worry about the possibilities $x \oplus 1$, $y \oplus 1$, etc., which would considerably complicate matters. (We will handle the affine case in Section 9.4.)

By analogy to Section 8.4, let us define the type of a string $z(x, y) \in \{0,1\}^n$ to be $(a, b, c)$, if $z$ contains $a$ copies of $x$ and $b$ copies of $y$ and $c$ copies of $x \oplus y$. Since any string of type $(a, b, c)$ can be transformed into any other string of type $(a, b, c)$ using bit-swaps, the type of $z$ is its only relevant property. As before, if by repeatedly applying a linear gate $G$, we can map some string of type $(a, b, c)$ into some string of type $(a', b', c')$, then we say that $G$ generates the slope

$$(a' - a, \; b' - b, \; c' - c).$$

Again, if $G$ generates the slope $(p, q, r)$, then $G^{-1}$ generates the slope $(-p, -q, -r)$. Also, if $G$ generates the slope $(p, q, r)$ using the string $z$, and the slope $(p', q', r')$ using the string $z'$, then it generates the slope $(p + p', q + q', r + r')$ using the string $zz'$. For these reasons, the set of achievable slopes forms a 3-dimensional lattice, which we denote $\mathcal{L}(G) \subseteq \mathbb{Z}^3$. Moreover, this is a lattice with a strong symmetry property:

Proposition 66 $\mathcal{L}(G)$ is symmetric under all 6 permutations of the 3 coordinates.

Proof. Clearly we can interchange the roles of $x$ and $y$. However, we can also, e.g., define $x' := x$ and $y' := x \oplus y$, in which case $x' \oplus y' = y$. In the triple $(x, y, x \oplus y)$, each element is the XOR of the other two. ■

Just like before, our entire question will boil down to whether or not the lattice $\mathcal{L}(G)$ contains
a certain point. In this case, the point is $(1,-1,1)$. The importance of the $(1,-1,1)$ point comes
from the following lemma.

Lemma 67 Let G be any linear gate. Then G generates CPD if and only if $(1,-1,1) \in \mathcal{L}(G)$.

Proof. If G generates CPD, then it maps $x0y$, which has type $(1,1,0)$, to $x, x, x \oplus y$, which has
type $(2,0,1)$. This amounts to generating the slope $(1,-1,1)$.

Conversely, suppose $(1,-1,1) \in \mathcal{L}(G)$. Then there is some gate $H \in \langle G \rangle$, and some string of
the form $z = x^a y^b (x \oplus y)^c$, such that

$H(z) = x^{a+1} y^{b-1} (x \oplus y)^{c+1}$.

But the very fact that G generates such an H implies that G is non-degenerate, and if G is
non-degenerate, then Lemma 46 implies that, starting from $xy0^{n-2}$, we can use G to increase the
numbers of x, y, and $x \oplus y$ simultaneously without bound. That is, there is some $Q \in \langle G \rangle$ such
that (omitting the 0 bits)

$Q(xy) = x^{a'} y^{b'} (x \oplus y)^{c'}$,

where $a' > a$ and $b' > b$ and $c' > c$. So then the procedure to implement CPD is to apply Q, then
H, then $Q^{-1}$. ■

Thus, our goal now is to show that, if G is any non-orthogonal, parity-preserving linear gate,
then $(1,-1,1) \in \mathcal{L}(G)$. Observe that, if $k(G) = 4$, then Corollary 21 implies that G is orthogonal,
contrary to assumption. By Theorem 19, this means that the only remaining possibility is $k(G) = 2$.
This has the following consequence for the lattice $\mathcal{L}(G)$.

Proposition 68 If G is a linear gate with $k(G) \le 2$, then $\mathcal{L}(G)$ contains all even points (i.e., all
$(p,q,r)$ such that $p \equiv q \equiv r \equiv 0 \pmod 2$).

Proof. By Proposition 40, we must be able to use G to map $10^{n-1}$ to $1110^{n-3}$. Since $0^n$ is mapped
to itself by any linear transformation, this implies that G can map $x0^{n-1}$ to $xxx0^{n-3}$, which means
that it generates the slope $(2,0,0)$. So $(2,0,0) \in \mathcal{L}(G)$. By Proposition 66, then, $\mathcal{L}(G)$ also
contains the points $(0,2,0)$ and $(0,0,2)$. But these three generate all the even points. ■

Proposition 68 has the following immediate corollary.

Corollary 69 Let G be a linear gate with $k(G) \le 2$, and suppose $\mathcal{L}(G)$ contains any point $(p,q,r)$
such that $p \equiv q \equiv r \equiv 1 \pmod 2$. Then $\mathcal{L}(G)$ contains $(1,-1,1)$.

Indeed, $(1,-1,1) - (p,q,r)$ is an even point, so it lies in $\mathcal{L}(G)$ by Proposition 68, and adding it
to $(p,q,r)$ gives $(1,-1,1)$.

Thus, it remains only to prove the following lemma.

Lemma 70 Let G be any parity-preserving, non-orthogonal linear gate. Then $\mathcal{L}(G)$ contains a
point $(p,q,r)$ such that $p \equiv q \equiv r \equiv 1 \pmod 2$.

Proof. In the proof of Theorem 64, this is the first place where we use the linearity of G in an
essential way, i.e., not just to deduce that $k(G) \in \{2,4\}$, or to avoid dealing with bits of the form
$x \oplus 1$, $y \oplus 1$, etc. It is also the first place where we use the non-orthogonality of G, other than to
rule out the possibility that $k(G) = 4$; and the first place where we use that G is parity-preserving.

Let us view G as an $n \times n$ matrix over $\mathbb{F}_2$. Then the fact that G is parity-preserving means
that every column of G has odd Hamming weight. Also, the fact that G is non-orthogonal means
that it must have two columns with an odd inner product. Assume without loss of generality that
these are the first and second columns. Let the first two columns of G consist of:

a rows of the form 1, 0,
b rows of the form 1, 1,
c rows of the form 0, 1,
d rows of the form 0, 0,

where a, b, c, d are nonnegative integers summing to n. Then from the above, we have that $a + b$
and $b + c$ and $b$ are all odd, from which it follows that a and c are even.

Now consider applying G to the input $xy0^{n-2}$. The result will contain:

a copies of x,
c copies of y,
b copies of $x \oplus y$.

This means that we've mapped a string of type $(1,1,0)$ to a string of type $(a,c,b)$, thereby generating
the slope $(a-1,\ c-1,\ b)$. But this is the desired odd point in $\mathcal{L}(G)$. ■
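The row-counting argument above can be checked on a concrete gate. The following is a minimal sketch, assuming a gate is given as a list of rows of its $\mathbb{F}_2$ matrix; the function name `slope_of_linear_gate` and the matrix `CNOTNOT` (taken here to map $(x,y,z)$ to $(x, y \oplus x, z \oplus x)$) are illustrative choices, not notation from the text.

```python
def slope_of_linear_gate(G):
    """Slope generated by applying the F_2 matrix G (list of rows) to x, y, 0, ..., 0.

    Output bit i equals G[i][0]*x XOR G[i][1]*y, so only the first two
    columns of G matter.
    """
    a = sum(1 for row in G if (row[0], row[1]) == (1, 0))  # output bits equal to x
    b = sum(1 for row in G if (row[0], row[1]) == (1, 1))  # output bits equal to x XOR y
    c = sum(1 for row in G if (row[0], row[1]) == (0, 1))  # output bits equal to y
    # The input has type (1, 1, 0); the output has type (a, c, b).
    return (a - 1, c - 1, b)


# Assumed matrix for CNOTNOT: (x, y, z) -> (x, y XOR x, z XOR x).
CNOTNOT = [
    [1, 0, 0],
    [1, 1, 0],
    [1, 0, 1],
]

slope = slope_of_linear_gate(CNOTNOT)
```

For this matrix the counts are a = 2, b = 1, c = 0, so the slope is $(1,-1,1)$, which is exactly the point that Lemma 67 asks for.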

Combining Lemma 65, Lemma 67, Corollary 69, and Lemma 70 now completes the proof of
Theorem 64.

9.3 Non-Parity-Preserving Linear Generates CNOT 

To complete the classification of linear gate sets, our final task is to prove the following theorem. 

Theorem 71 Let G be any non-parity-preserving linear gate. Then G generates CNOT (or
equivalently, all linear transformations).

Recall that COPY is the partial gate that maps $x0$ to $xx$. We will first show how to use G to
generate COPY, and then use COPY to generate CNOT.

Note that since G is linear, it cannot be parity-flipping. So since G is non-parity-preserving, it
is also non-parity-respecting, and $k(G)$ must be finite and odd. But by Theorem 19, this means
that $k(G) = 1$: in other words, G is non-mod-respecting.

Let z be an n-bit string that consists entirely of copies of x and 0. Let the type of z be the
number of copies of x. Clearly we can map any z to any other z of the same type using swaps, so
the type of z is its only relevant property. Also, we say that a gate G generates the slope p, if by
applying G repeatedly, we can map some input z of type a to some input z' of type a + p. Note that
if G generates the slope p, then by reversibility, it also generates the slope −p. Also, if G generates
the slope p by mapping z to z', and the slope q by mapping w to w', then it generates the slope p + q
by mapping zw to z'w'. For these reasons, the set of achievable slopes forms an ideal in $\mathbb{Z}$ (i.e., a
1-dimensional lattice), which we can denote $\mathcal{L}(G)$. The question of whether G generates COPY
can then be rephrased as the question of whether $\mathcal{L}(G)$ contains 1, or equivalently, of whether
$\mathcal{L}(G) = \mathbb{Z}$.
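Since every ideal of $\mathbb{Z}$ is generated by a single nonnegative integer, the test "does the slope set contain 1" reduces to a gcd computation. The sketch below assumes we are handed a finite list of slopes that some gate is known to achieve; the function names and the sample lists are hypothetical and are not derived from any particular gate in the text.

```python
from functools import reduce
from math import gcd


def slope_ideal_generator(slopes):
    """Return d >= 0 such that the ideal generated by the given slopes is d*Z."""
    return reduce(gcd, (abs(s) for s in slopes), 0)


def generates_copy(slopes):
    """True when the ideal generated by the slopes contains 1, i.e. equals Z."""
    return slope_ideal_generator(slopes) == 1


# Hypothetical slope lists, for illustration only.
coprime_example = generates_copy([2, 3])   # gcd is 1, so the ideal is all of Z
even_example = generates_copy([4, 6])      # gcd is 2, so only even slopes arise
```

The second list illustrates the obstruction: if every achievable slope is even, no combination of them can produce the slope 1 that COPY needs.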

Lemma 72 A linear gate G generates COPY if and only if $1 \in \mathcal{L}(G)$.

Proof. If G generates COPY, then clearly $1 \in \mathcal{L}(G)$. For the converse direction, suppose
$1 \in \mathcal{L}(G)$. Then G can be used to map an input of type a to an input of type a + 1, for some a.
Hence G can also be used to map inputs of type b to inputs of type b + 1, for all $b \ge a$. This also
implies that G is non-degenerate, so by Lemma 46, it can be used to increase the number of copies
of x without bound. So to copy a bit x, we first apply some gate $H \in \langle G \rangle$ to map $x$ to $x^b$ for some
$b \ge a$, then map $x^b$ to $x^{b+1}$, and finally apply $H^{-1}$ to map $x^{b+1}$ to $xx$. ■

Now, the question of whether $1 \in \mathcal{L}(G)$ is easily answered.

Lemma 73 Let G be any non-mod-respecting linear gate. Then $\mathcal{L}(G) = \mathbb{Z}$.

Proof. This follows almost immediately from Proposition 40, together with the fact that $k(G) = 1$.
We simply need to observe that, if x = 1, then the number of copies of x corresponds to the
Hamming weight. ■

Finally, we show that COPY suffices for CNOT. 

Lemma 74 Let G be any linear gate that generates COPY. Then G generates CNOT. 

Proof. We will actually prove that G generates any linear transformation F. Observe that, if G
generates COPY, then it must be non-degenerate. Therefore, by copying bits whenever needed,
and using G to do computation on them, clearly we can map the input x to a string of the form

$x,\ \mathrm{gar}(x),\ F(x)$,

for some garbage string gar(x). Since G generates COPY, we can then make one copy of F(x),
mapping the above to

$x,\ \mathrm{gar}(x),\ F(x),\ F(x)$.

Next we can uncompute the computation of F to get

$x,\ F(x)$.

By reversibility, we can then map the above to

$x,\ F(x),\ \mathrm{gar}(F(x)),\ x$.

By inverting COPY, we can then implement $xx \to x$, to map the above to

$F(x),\ \mathrm{gar}(F(x)),\ x$.

Finally, we can uncompute the computation of x to get F(x) alone. ■

Combining Lemmas 72, 73, and 74 now completes the proof of Theorem 71. Then combining
Theorems [32l [33l [36l [371 ESI ESI ES and [TH we can summarize our progress on the linear case as 
follows. 

Corollary 75 Every set of linear gates generates either ⟨∅⟩, ⟨T_6⟩, ⟨T_4⟩, ⟨CNOTNOT⟩, or ⟨CNOT⟩.

9.4 Adding Back the NOTs 

Now that we have completed the classification of the linear gate classes, the final step that remains 
is to take care of the affine parts. We first give some useful lemmas for manipulating affine gates. 

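Lemma 76 $\mathrm{NOT}^{\otimes k}$ generates NOTNOT for all $k \ge 1$, as well as NOT if k is odd.

Proof. To implement NOTNOT(x, y), apply $\mathrm{NOT}^{\otimes k}$ to $x, a_1 \ldots a_{k-1}$ and then to $y, a_1 \ldots a_{k-1}$. To
implement NOT(x), let $\ell := (k-1)/2$. Apply $\mathrm{NOT}^{\otimes k}$ to $x, a_1 \ldots a_\ell, b_1 \ldots b_\ell$, then $x, a_1 \ldots a_\ell, c_1 \ldots c_\ell$,
then $x, b_1 \ldots b_\ell, c_1 \ldots c_\ell$. ■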
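The three-step construction of NOT from $\mathrm{NOT}^{\otimes k}$ is easy to simulate. The following sketch assumes bits are stored in a Python list with x at index 0 followed by three groups of $\ell = (k-1)/2$ zero ancillas; the function names are illustrative.

```python
def apply_not_k(bits, positions):
    """Apply one NOT^{(x)k} gate: flip the k bits at the given positions."""
    for p in positions:
        bits[p] ^= 1


def not_via_not_k(x, k):
    """Implement NOT(x) with three NOT^{(x)k} gates, for odd k.

    Layout: [x, a_1..a_l, b_1..b_l, c_1..c_l] with l = (k - 1) / 2.
    """
    assert k % 2 == 1, "the construction needs k odd"
    l = (k - 1) // 2
    bits = [x] + [0] * (3 * l)
    a = list(range(1, 1 + l))
    b = list(range(1 + l, 1 + 2 * l))
    c = list(range(1 + 2 * l, 1 + 3 * l))
    # x is flipped three times (a net single flip); every ancilla is flipped twice.
    for group in (a + b, a + c, b + c):
        apply_not_k(bits, [0] + group)
    return bits


result = not_via_not_k(1, 5)
```

Each gate touches x plus $2\ell = k - 1$ ancillas, so it really is a k-bit gate, and the ancillas return to 0 as the ancilla rule requires.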

More generally: 

Lemma 77 Let G be any gate of the form $\mathrm{NOT} \otimes H$. Then G generates NOTNOT.

Proof. To implement NOTNOT(x, y), first apply G to $x, a$, where a is some ancilla string; then
apply $G^{-1}$ to $y, a$. ■

Also: 

Lemma 78 Let $G(x) = Ax \oplus b$ be an affine gate. Then G + NOTNOT generates A itself.

Proof. First we use $G^{\otimes 2}$ to map $x, 0^n$ to $Ax \oplus b,\ b$; then we use NOTNOT gates to map $Ax \oplus b,\ b$
to $Ax,\ 0^n$. ■
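The two steps of this proof can be traced on explicit bit vectors. This is a minimal sketch, assuming A is a list of rows over $\mathbb{F}_2$ and b a list of bits; `apply_affine` and `strip_affine_part` are hypothetical helper names.

```python
def apply_affine(A, b, x):
    """Return Ax + b over F_2, where A is a list of rows."""
    return [(sum(a & v for a, v in zip(row, x)) + bi) % 2 for row, bi in zip(A, b)]


def strip_affine_part(A, b, x):
    """Compute Ax using two copies of G(x) = Ax + b and NOTNOT gates.

    Returns the pair (top register, bottom register).
    """
    n = len(x)
    top = apply_affine(A, b, x)            # first copy of G acts on x
    bottom = apply_affine(A, b, [0] * n)   # second copy acts on 0^n and yields b
    for i in range(n):
        if b[i] == 1:        # b is a constant of the gate, so this circuit is fixed
            top[i] ^= 1      # one NOTNOT gate flips bit i of both registers
            bottom[i] ^= 1
    return top, bottom


# Example: linear part of CNOT with a translation by b = 11.
A = [[1, 0], [1, 1]]
b = [1, 1]
top, bottom = strip_affine_part(A, b, [1, 0])
```

The bottom register ends at $0^n$, so it can be reused as an ancilla, and the top register holds $Ax$ alone.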

By combining Lemmas 77 and 78, we obtain the following.

Corollary 79 (Cruft Removal) Let $G(x) = Ax \oplus b$ be an n-bit affine gate. Suppose A applies
a linear transformation A' to the first m bits of x, and acts as the identity on the remaining n − m
bits. Then G generates an m-bit gate of the form $H(x) = A'x \oplus c$.

Proof. If $b_i = 0$ for all $i > m$, then we are done. Otherwise, we can use Lemma 77 to generate
NOTNOT, and then Lemma 78 to generate $H(x) = A'x$. ■

Lemma 80 Let S be any class of parity-preserving linear or affine gates. Then there are no
classes between ⟨S⟩ and ⟨S + NOT⟩ other than ⟨S + NOTNOT⟩.

Proof. Let G be a transformation that is generated by S + NOT but not by S. Then we need to
show how to generate NOT or NOTNOT themselves using S + G.

We claim that G acts as

$G(x) = V(x) \oplus b$,

where V(x) is some parity-preserving affine transformation generated by S, and b is some nonzero
string. First, V must be generated by S because, given any circuit for G over the set S + NOT,
we can always push the NOT gates to the end; this leaves us with a circuit for the "S part" of G.
(This is the one place where we use that S is affine.) Also, b must be nonzero because otherwise,
G would already be generated by S.

Given x, suppose we first apply $V^{-1}$ (which must be generated by S), then apply G. This
yields

$G(V^{-1}(x)) = V(V^{-1}(x)) \oplus b = x \oplus b$,

which is equivalent to $\mathrm{NOT}^{\otimes k}$ for some nonzero k. By Lemma 76, this generates NOTNOT. If
$|b|$ is always even, then since V is parity-preserving, clearly we remain within ⟨S + NOTNOT⟩. If,
on the other hand, $|b|$ is ever odd, then again by Lemma 76, we can generate NOT. ■

We can finally complete the proof of Theorem 60, characterizing the possible affine classes.

Proof of Theorem 60. If we restrict ourselves to the linear part of the class, then we know from
Corollary 75 that the only possibilities are ⟨CNOT⟩, ⟨CNOTNOT⟩, ⟨T_4⟩, ⟨T_6⟩, and ⟨∅⟩ (i.e., the
trivial class). We will handle these possibilities one by one.

Linear part is ⟨CNOT⟩. Since CNOT can already generate all affine transformations (by
Theorem 32), using an ancilla bit initialized to 1, we have ⟨S⟩ ⊆ ⟨CNOT⟩. For the other direction,
Corollary 79 implies that S must generate a gate of the form $\mathrm{CNOT}(x) \oplus b$, for some $b \in \{0,1\}^2$.
However, it is not hard to see that all such gates can generate CNOT itself.

Linear part is ⟨CNOTNOT⟩. Here we clearly have ⟨S⟩ ⊆ ⟨CNOTNOT, NOT⟩. Meanwhile,
Corollary 79 again implies that S generates a gate of the form $G(x) = \mathrm{CNOTNOT}(x) \oplus b$, for some
$b \in \{0,1\}^3$. Suppose the first bit of b is 1; this is the bit that corresponds to the control of the
CNOTNOT. Then G(G(x)) generates NOTNOT, so by Lemma 78 we can generate CNOTNOT.
If, on the other hand, the first bit of b is 0, then G generates NOT or NOTNOT directly, so we can
again use Lemma 78 to generate CNOTNOT. Therefore ⟨S⟩ lies somewhere between ⟨CNOTNOT⟩
and ⟨CNOTNOT, NOT⟩. But since CNOTNOT already generates NOTNOT, Lemma 80 says that
the only possibilities are ⟨CNOTNOT⟩ and ⟨CNOTNOT, NOT⟩.

Linear part is ⟨T_4⟩. In this case ⟨S⟩ ⊆ ⟨T_4, NOT⟩. Again Corollary 79 implies that S
generates a gate of the form $G(x) = T_4(x) \oplus b$, for some $b \in \{0,1\}^4$. If b = 1111, then S generates
F_4. So ⟨S⟩ lies somewhere between ⟨F_4⟩ and ⟨F_4, NOT⟩ = ⟨T_4, NOT⟩, but then Lemma 80 ensures
that ⟨F_4⟩, ⟨F_4, NOTNOT⟩ = ⟨T_4, NOTNOT⟩, and ⟨T_4, NOT⟩ are the only possibilities. Likewise,
if b = 0000, then S generates T_4, so ⟨T_4⟩, ⟨T_4, NOTNOT⟩, and ⟨T_4, NOT⟩ are the only possibilities.

Next suppose $|b|$ is odd. Then $G(G(x)) = \mathrm{NOT}^{\otimes 4}(x)$, which generates NOTNOT by Lemma
76. So by Lemma 78, we generate T_4 as well. Thus we have at least ⟨T_4, NOTNOT⟩. But since
G itself is parity-flipping, ⟨S⟩ is not parity-preserving, leaving ⟨T_4, NOT⟩ as the only possibility by
Lemma 80. Finally suppose $|b| = 2$: without loss of generality, b = 1100. Let Q be an operation
that swaps the first two bits of x with the last two bits. Then G(Q(G(x))) is equivalent to
$\mathrm{NOT}^{\otimes 4}(x)$ up to swaps, so again we have at least ⟨T_4, NOTNOT⟩, leaving ⟨T_4, NOTNOT⟩ and
⟨T_4, NOT⟩ as the only possibilities.

Linear part is ⟨T_6⟩. In this case ⟨S⟩ ⊆ ⟨T_6, NOT⟩. Again, Corollary 79 implies that
S generates $G(x) = T_6(x) \oplus b$ for some $b \in \{0,1\}^6$. If b = 000000, then S generates T_6, so
⟨T_6⟩, ⟨T_6, NOTNOT⟩, and ⟨T_6, NOT⟩ are the only possibilities by Lemma 80. If $|b|$ is odd, then
$G(G(x)) = \mathrm{NOT}^{\otimes 6}(x)$. By Lemma 76, this means that S generates NOTNOT, so by Lemma 78 it
generates T_6 as well. But G is parity-flipping, leaving ⟨T_6, NOT⟩ as the only possibility by Lemma
80. If $|b|$ is 2 or 4, then by an appropriate choice of swap operation Q, we can cause G(Q(G(x)))
to generate NOTNOT, so again ⟨T_6, NOTNOT⟩ and ⟨T_6, NOT⟩ are the only possibilities.

Finally, if b = 111111, then $G(x) = F_6(x)$. In this case we start with the operation

$F_6(x00000) = 1\bar{x}\bar{x}\bar{x}\bar{x}\bar{x}$.

Using three of the $\bar{x}$ outputs and three fresh 0 ancilla bits, we then perform

$F_6(\bar{x}\bar{x}\bar{x}000) = 111xxx$.

Next, bringing the $xxx$ outputs together with the remaining $\bar{x}\bar{x}$ outputs and one fresh 0 ancilla bit,
we apply

$F_6(xxx\bar{x}\bar{x}0) = 11100\bar{x}$.

In summary, we have performed a NOT(x) operation with some garbage still around. However,
if we repeat this entire procedure 6 times, then the Hamming weight of the garbage will be a
multiple of 6. We can remove all of this garbage using the F_6 gate. Therefore, we have created
a $\mathrm{NOT}^{\otimes 6}$ gate, which generates NOTNOT by Lemma 76. So again we can generate T_6, leaving
⟨T_6, NOTNOT⟩ and ⟨T_6, NOT⟩ as the only possibilities by Lemma 80.
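The three applications of F_6 can be replayed bit by bit. The sketch below assumes that F_6 flips all six bits exactly when the input has even parity (that is, $F_6(x) = T_6(x) \oplus 111111$, with T_6 flipping all bits on odd parity); that definition is stated outside this section, so treat it as an assumption of the example. The function names are illustrative.

```python
def F6(bits):
    """Assumed F_6: flip all six bits when the input parity is even, else do nothing."""
    assert len(bits) == 6
    if sum(bits) % 2 == 0:
        return [1 - v for v in bits]
    return list(bits)


def not_with_garbage(x):
    """Replay the three F_6 steps; return (NOT x, list of garbage bits)."""
    s1 = F6([x, 0, 0, 0, 0, 0])           # 1, ~x, ~x, ~x, ~x, ~x
    s2 = F6(s1[1:4] + [0, 0, 0])          # 1, 1, 1, x, x, x
    s3 = F6(s2[3:6] + s1[4:6] + [0])      # 1, 1, 1, 0, 0, ~x
    garbage = [s1[0]] + s2[0:3] + s3[0:5]
    return s3[5], garbage


out0, garbage0 = not_with_garbage(0)
out1, garbage1 = not_with_garbage(1)
```

Under this assumption the garbage is the same constant string of Hamming weight 7 for both inputs, so six repetitions leave garbage of weight 42, a multiple of 6 as the argument requires.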

Linear part is ⟨∅⟩. In this case ⟨S⟩ ⊆ ⟨NOT⟩, so Lemma 80 implies that the only possibilities
are ⟨∅⟩, ⟨NOTNOT⟩, and ⟨NOT⟩. ■

10 Open Problems 

As discussed in Section [H the central challenge we leave is to give a complete classification of all 
quantum gate sets acting on qubits, in terms of which unitary transformations they can generate 
or approximate. Here, just like in this paper, one should assume that qubit-swaps are free, and 
that arbitrary ancillas are allowed as long as they are returned to their initial states. 

A possible first step, which would build directly on our results here, would be to classify all
possible quantum gate sets within the stabilizer group, which is a quantum generalization of the
group of affine classical reversible transformations. Since the stabilizer group is discrete, here
at least there is no need for representation theory, Lie algebras, or any notion of approximation,
but the problem still seems complicated. A different step in the direction we want, which would
involve Lie algebras, would be to classify all sets of 1- and 2-qubit gates. A third step would be to 
classify qubit Hamiltonians (i.e., the infinitesimal-time versions of unitary gates), in terms of which 
n-qubit Hamiltonians they can be used to generate. Here the recent work of Cubitt and Montanaro 
[^, which classifies qubit Hamiltonians in terms of the complexity of approximating ground state
energies, might be relevant. Yet a fourth possibility would be to classify quantum gates under the 
assumption that intermediate measurements are allowed. Of course, these simplifications can also 
be combined. 

On the classical side, we have left completely open the problem of classifying reversible gate sets
over non-binary alphabets. In the non-reversible setting, it was discovered in the 1950s (see [18])
that Post’s lattice becomes dramatically different and more complicated when we consider gates
over a 3-element set rather than Boolean gates: for example, there is now an uncountable infinity of
clones, rather than “merely” a countable infinity. Does anything similar happen in the reversible
case? Even for reversible gates over (say) $\{0,1,2\}^n$, we cannot currently give an algorithm to
decide whether a given gate G generates another gate H any better than the
triple-exponential-time algorithm that comes from clone theory, nor can we give reasonable upper bounds on the
number of gates or ancillas needed in the generating circuit, nor can we answer basic questions like
whether every class is finitely generated.

Finally, can one reduce the number of gates in each of our circuit constructions to the limits 
imposed by Shannon-style counting arguments? What are the tradeoffs, if any, between the number 
of gates and the number of ancilla bits?

12 Appendix: Post’s Lattice with Free Constants

For completeness, in this appendix we prove a ‘quick-and-dirty’ version of Post’s 1941 classification 
theorem [22], for sets of ordinary (non-reversible) Boolean logic gates. 

Theorem 81 (Post’s Lattice Lite) Assume the constant functions f = 0 and f = 1, as well as
the identity function f(x) = x, are available for free. Then the only Boolean clones (i.e., classes of
Boolean functions $f : \{0,1\}^n \to \{0,1\}$ closed under composition and addition of dummy variables)
are the following:

1. The trivial class (containing the constant and identity functions).

2. The AND class.

3. The OR class.

4. The class of monotone functions (generated by {AND, OR}).

5. The NOT class.

6. The class of affine functions (generated by {XOR, NOT}).

7. The class of all Boolean functions (generated by {AND, NOT}).

Proof. We take it as known that {AND, OR} generates all monotone functions, {XOR, NOT} 
generates all affine functions, and {G, NOT} generates all functions, for any 2-bit non-affine gate 
G. 

Let C be a Boolean clone that contains the constant 0 and 1 functions. Then C is closed under
restrictions (e.g., if $f(x,y) \in C$, then $f(0,y)$ and $f(x,1)$ are also in C), and that is the crucial fact
we exploit.

First suppose C contains a non-monotone gate. Then certainly we can construct a NOT gate 
by restricting inputs. If, in addition, C contains a non-affine gate, then by Proposition 031 we 
can construct a 2-bit non-affine gate by restricting inputs: AND, OR, NAND, NOR, IMPLIES, or 
NOT (IMPLIES). Together with the NOT gate, this puts us in class 7. If, on the other hand, C 
contains only affine gates, then as long as one of those gates depends on at least two input bits, 
by restricting inputs we can construct a 2-bit non-degenerate affine gate: XOR or NOT (XOR). 
Together with the NOT gate, this puts us in class 6. If, on the other hand, every gate depends on 
only 1 input bit, then we are in class 5. 

Next suppose C contains only monotone gates. Clearly the only affine monotone gates are 
trivial. Thus, as long as one of the gates is nontrivial, it is non-affine, so Proposition 04] again 
implies that we can construct a non-affine 2-bit monotone gate by restricting inputs: AND or OR. 
If we can construct only AND gates, then we are in class 2; if only OR gates, then we are in class 
3; if both, then we are in class 4. If, on the other hand, every gate is trivial, then we are in class 1. ■
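The step "we can construct a NOT gate by restricting inputs" can be made concrete with a small search. The sketch below assumes a Boolean function is given as a Python callable on n bits; the name `find_not_restriction` is hypothetical. It looks for a free position i and constants for the other inputs such that the restricted one-bit function sends 0 to 1 and 1 to 0.

```python
from itertools import product


def find_not_restriction(f, n):
    """Search for a restriction of the n-bit function f that acts as NOT.

    Returns (i, rest), where i is the free input position and rest holds the
    constants plugged into the other n - 1 positions, or None if f is monotone.
    """
    for i in range(n):
        for rest in product((0, 1), repeat=n - 1):
            lo = rest[:i] + (0,) + rest[i:]   # free input set to 0
            hi = rest[:i] + (1,) + rest[i:]   # free input set to 1
            if f(*lo) == 1 and f(*hi) == 0:
                return i, rest
    return None


xor_case = find_not_restriction(lambda x, y: x ^ y, 2)
and_case = find_not_restriction(lambda x, y: x & y, 2)
```

A non-monotone function always admits such a restriction: along any chain of single-bit increases from an input with output 1 to a larger input with output 0, some step must flip the output from 1 to 0. For a monotone function such as AND, the search returns None.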

The simplicity of Theorem 81 underscores how much more complicated it is to understand
reversible gates than non-reversible gates, when we impose a similar rule in both cases (i.e., that 0
and 1 constant or ancilla bits are available for free).

13 Appendix: The Classification Theorem with Loose Ancillas 

Theorem 82 Under the loose ancilla rule, the only change to Theorem 3 is that every C + NOTNOT
class collapses with the corresponding C + NOT class.

Proof. That this collapse happens is clear: under the loose ancilla rule, we can always simulate 
a NOT gate by applying a NOTNOT gate to the desired bit, as well as to a “dummy” ancilla bit 
that will never be used for any other purpose. 

To see that no other collapses happen, we must show that the remaining classes are distinct. 
Under the usual ancilla rule, the classes are distinct because for any pair of classes we can find an 
invariant satisfied by one, but not the other, to separate the two. We would like to do the same 
for loose ancilla classes, but invariants under the usual rule need not, a priori, be invariants under 
the loose ancilla rule. More concretely, as we have seen, a gate set that preserves parity under the
usual rule need no longer preserve it under the loose ancilla rule. However, we claim that all the
other invariants are also loose ancilla invariants.

Suppose $G(x, a) = (H(x), b)$ is a transformation generated under the loose ancilla rule, where
a and b are constants, so that under the loose ancilla rule, we have also generated H. We would
like to show that any invariant of G must also hold for H, so let us consider the invariants one by
one.


• If G is mod-k-respecting then

$|G(x,a)| - |x| - |a| = |H(x)| - |x| + |b| - |a|$

is constant modulo k, and hence $|H(x)| - |x|$ is constant modulo k, so H is mod-k-respecting.
For $k \ge 3$, mod-k-respecting is equivalent to mod-k-preserving by Theorem 12.
When k = 2, we have already seen that NOTNOT collapses with NOT.

• If G is conservative then $0 = |G(x,a)| - |x| - |a| = |H(x)| - |x| + |b| - |a|$ as above. If we average
over all x and appeal to reversibility, then we see that $|b| - |a|$ must be 0, and hence H is
conservative.


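• If G is affine then

$$G\begin{pmatrix} x \\ a \end{pmatrix} = \begin{pmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{pmatrix}\begin{pmatrix} x \\ a \end{pmatrix} + \begin{pmatrix} c_1 \\ c_2 \end{pmatrix},$$

so clearly $H(x) = M_{11}x + M_{12}a + c_1$ is affine as well. Since $M_{21}x + M_{22}a + c_2 = b$ for all x,
we must have $M_{21} = 0$. But this means that if the columns of

$$\begin{pmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{pmatrix},$$

the linear part of G, have weight 2, weight 4, or are orthogonal, then the same is true of the
columns of

$$\begin{pmatrix} M_{11} \\ 0 \end{pmatrix},$$

and hence the columns of $M_{11}$ itself. In short, if the linear part of G has any of the properties
we are interested in, then so does the linear part of H.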

• If G is orthogonal then $c_1 = 0$ and $c_2 = 0$. Recall that $M_{21} = 0$, and since a matrix of the
form

$$\begin{pmatrix} A & B \\ 0 & C \end{pmatrix}$$

has an inverse of the same form, and the inverse of an orthogonal matrix is its transpose, we
see that $M_{12} = 0$. It follows that $H(x) = M_{11}x + M_{12}a + c_1$ is actually just $H(x) = M_{11}x$
when G is orthogonal, therefore H is orthogonal because $M_{11}$ is orthogonal.

14 Appendix: Number of Gates Generating Each Class

In this appendix, we count how many n-bit gates belong to each of the classes of Theorem 3. Let
us write $\langle G \rangle_n$ for the set of n-bit gates generated by G, and $\#\langle G \rangle_n$ for the number of n-bit gates
generated by G. Then Theorem 83 gives the exact number of gates in each class, while Theorem 87
gives the asymptotics.

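Theorem 83 Let $n \ge 1$ be an integer.

• The total number of gates is

$$\#\langle \mathrm{Toffoli} \rangle_n = (2^n)!,$$

and the non-affine classes break down as follows. For $k \ge 3$,

$$\#\langle \mathrm{Fredkin}, \mathrm{NOT} \rangle_n = 2\big((2^{n-1})!\big)^2,$$
$$\#\langle \mathrm{Fredkin}, \mathrm{NOTNOT} \rangle_n = \big((2^{n-1})!\big)^2,$$
$$\#\langle C_k \rangle_n = \prod_{i=0}^{k-1}\Big(\sum_{j \equiv i \ (\mathrm{mod}\ k)}\binom{n}{j}\Big)!,$$
$$\#\langle \mathrm{Fredkin} \rangle_n = \prod_{i=0}^{n}\Big(\binom{n}{i}!\Big).$$

• The total number of affine gates is

$$\#\langle \mathrm{CNOT} \rangle_n = 2^{n(n+1)/2}\prod_{i=1}^{n}(2^i - 1).$$

• The numbers of parity-preserving and parity-respecting gates are:

$$\#\langle \mathrm{CNOTNOT} \rangle_n = 2^{n(n+1)/2 - 1}\prod_{i=1}^{n-1}(2^i - 1),$$
$$\#\langle \mathrm{CNOTNOT}, \mathrm{NOT} \rangle_n = 2^{n(n+1)/2}\prod_{i=1}^{n-1}(2^i - 1).$$

• The numbers of gates in ⟨∅⟩, ⟨T_6⟩, and ⟨T_4⟩ are:

$$\#\langle \varnothing \rangle_n = n!,$$

$$\#\langle T_4 \rangle_n = \begin{cases} 2^{m^2}\prod_{i=1}^{m-1}(2^{2i} - 1), & \text{if } n = 2m, \\ 2^{m^2}\prod_{i=1}^{m}(2^{2i} - 1), & \text{if } n = 2m + 1, \end{cases}$$

$$\#\langle T_6 \rangle_n = \begin{cases} 1, & \text{if } n = 1, \\ 2^{4m^2+1}\prod_{i=1}^{2m}(2^{2i} - 1), & \text{if } n = 4m + 2, \\ 2^{4m^2+2m+1}\big(2^{2m+1} + (-1)^m\big)\prod_{i=1}^{2m}(2^{2i} - 1), & \text{if } n = 4m + 3, \\ 2^{4m^2-2m+1}\big(2^{2m-1} - (-1)^m\big)\prod_{i=1}^{2m-2}(2^{2i} - 1), & \text{if } n = 4m \ge 4, \\ 2^{4m^2-2m+1}\big(2^{2m} - (-1)^m\big)\prod_{i=1}^{2m-1}(2^{2i} - 1), & \text{if } n = 4m + 1 \ge 5. \end{cases}$$

Furthermore,

$$\#\langle F_4 \rangle_n = \#\langle T_4 \rangle_n.$$

• For any linear class ⟨G⟩ we have

$$\#\langle G, \mathrm{NOT} \rangle_n = \#\langle G \rangle_n \, 2^n,$$
$$\#\langle G, \mathrm{NOTNOT} \rangle_n = \#\langle G \rangle_n \, 2^{n-1}.$$
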
Let us count each class in turn. To start, note that an n-bit reversible gate is, by definition, a
permutation of $\{0,1\}^n$, so there are $(2^n)!$ gates in total.

Parity-preserving gates map even-weight strings to even-weight strings, and odd-weight strings
to odd-weight strings. It follows that there are $\big((2^{n-1})!\big)^2$ parity-preserving gates. Clearly there
are exactly twice as many parity-respecting gates, since we can append a NOT gate to any
parity-preserving gate to get a parity-flipping gate, and vice versa.

The mod-k-preserving gates (for $k \ge 3$) also decompose into a product of permutations, one for
each Hamming weight class modulo k. This leads to the formula

$$\#\langle C_k \rangle_n = \prod_{i=0}^{k-1}\Big(\sum_{j \equiv i \ (\mathrm{mod}\ k)}\binom{n}{j}\Big)!.$$

Likewise, for conservative gates, we have

$$\#\langle \mathrm{Fredkin} \rangle_n = \prod_{i=0}^{n}\Big(\binom{n}{i}!\Big).$$

The linear part of an affine gate is an $n \times n$ invertible matrix A. The number of such matrices
is well-known to be

$$\prod_{i=0}^{n-1}(2^n - 2^i) = 2^{n(n-1)/2}\prod_{i=1}^{n}(2^i - 1).$$

There are an additional $2^n$ choices for the affine part, so

$$\#\langle \mathrm{CNOT} \rangle_n = 2^{n(n+1)/2}\prod_{i=1}^{n}(2^i - 1).$$

A parity-preserving affine transformation is an affine transformation on the (n − 1)-dimensional
subspace of even Hamming-weight vectors, extended to the entire space by defining the
transformation on any odd-weight vector. There are $2^{n(n-1)/2}\prod_{i=1}^{n-1}(2^i - 1)$ affine transformations on n − 1
dimensions and $2^{n-1}$ choices of odd-weight vector, so there are

$$\#\langle \mathrm{CNOTNOT} \rangle_n = 2^{n(n+1)/2 - 1}\prod_{i=1}^{n-1}(2^i - 1)$$

parity-preserving affine transformations, and twice as many parity-respecting affine
transformations.
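The counting formulas derived so far are straightforward to evaluate exactly. The following sketch uses only the Python standard library; the function names are illustrative and simply mirror the formulas for all gates, parity-preserving gates, mod-k-preserving gates, conservative gates, affine gates, and parity-preserving affine gates.

```python
from math import comb, factorial, prod


def count_all(n):
    """All n-bit reversible gates: permutations of {0,1}^n."""
    return factorial(2 ** n)


def count_parity_preserving(n):
    """Independent permutations of the even-weight and odd-weight strings."""
    return factorial(2 ** (n - 1)) ** 2


def count_mod_k_preserving(n, k):
    """One permutation per Hamming-weight class modulo k."""
    return prod(
        factorial(sum(comb(n, j) for j in range(i, n + 1, k))) for i in range(k)
    )


def count_conservative(n):
    """One permutation per exact Hamming weight."""
    return prod(factorial(comb(n, i)) for i in range(n + 1))


def count_affine(n):
    """2^n translations times the number of invertible n x n matrices over F_2."""
    return 2 ** n * prod(2 ** n - 2 ** i for i in range(n))


def count_parity_preserving_affine(n):
    return 2 ** (n * (n + 1) // 2 - 1) * prod(2 ** i - 1 for i in range(1, n))


counts_n3 = (
    count_all(3),
    count_parity_preserving(3),
    count_mod_k_preserving(3, 3),
    count_conservative(3),
    count_affine(3),
    count_parity_preserving_affine(3),
)
```

For n = 3 these evaluate to 40320, 576, 72, 36, 1344, and 96. As a cross-check against Table 1, the affine total 1344 equals the sum of the n = 3 generator counts of the affine classes, 1152 + 72 + 72 + 24 + 18 + 6.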





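We refer to MacWilliams [20] for the formula (below) for the number of orthogonal $n \times n$
matrices.

$$\#\langle T_4 \rangle_n = \begin{cases} 2^{m^2}\prod_{i=1}^{m-1}(2^{2i} - 1), & \text{if } n = 2m, \\ 2^{m^2}\prod_{i=1}^{m}(2^{2i} - 1), & \text{if } n = 2m + 1. \end{cases}$$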

We now turn our attention to counting $\langle T_6 \rangle_n$, which is more involved. The approach will be similar
to that of MacWilliams [20]. It will help to consider $\langle T_4 \rangle_n$ and $\langle T_6 \rangle_n$ as groups. Indeed, $\langle T_4 \rangle_n$ is
just the orthogonal group O(n) over $\mathbb{F}_2$, and $\langle T_6 \rangle_n$ is a proper subgroup.

The idea is to find a unique representative for each of the cosets of $\langle T_6 \rangle_n$ in $\langle T_4 \rangle_n$. Since we
know $\#\langle T_4 \rangle_n$ by [20], dividing by the number of unique representatives will give us $\#\langle T_6 \rangle_n$ as
desired.

Recall that by Lemma 16, the Hamming weight of each column vector of an orthogonal matrix
is either 1 mod 4 or 3 mod 4. If $A \in \langle T_4 \rangle_n$ is an orthogonal matrix with column vectors $a_1, \ldots, a_n$,
then the characteristic vector c(A) is an n-dimensional vector whose i-th entry, $c_i(A)$, is defined as
follows:

$$c_i(A) = \begin{cases} 1 & \text{if } |a_i| \equiv 3 \pmod 4, \\ 0 & \text{if } |a_i| \equiv 1 \pmod 4. \end{cases}$$

The following lemma shows that these characteristic vectors can be used as representatives for
the cosets of $\langle T_6 \rangle_n$.

Lemma 84 Two orthogonal transformations, $A, B \in \langle T_4 \rangle_n$, are in the same coset of $\langle T_6 \rangle_n$ if and
only if c(A) = c(B).

Proof. Note that A and B are in the same coset if and only if $T := BA^{-1} = BA^T$ is in $\langle T_6 \rangle_n$.
We know that $T \in \langle T_4 \rangle_n$, and that $T(a_i) = b_i$ for all i. Since $a_1, \ldots, a_n$ is an orthogonal basis,
Theorem 37 says that $T \in \langle T_6 \rangle_n$ if and only if T is mod-4-preserving. By Theorem 20, this holds
if and only if $|a_i| \equiv |b_i| \pmod 4$ for all i, or equivalently, c(A) = c(B). ■

Lemma 84 shows that it suffices to count the number of possible characteristic vectors. Perhaps
surprisingly, not every characteristic vector is achievable; the following lemma shows exactly which
ones are.

Lemma 85 If $A \in \langle T_4 \rangle_n$, then $|c(A)| \equiv 0 \pmod 4$. Furthermore, for every characteristic vector
c such that $|c| \equiv 0 \pmod 4$, there exists a matrix $A \in \langle T_4 \rangle_n$ such that c(A) = c.


Proof. Let $A \in \langle T_4 \rangle_n$ with column vectors $a_1, \ldots, a_n$. Of course, A might not preserve Hamming
weight mod 4. The main idea of the proof is to promote A to an affine function $f(x) = Ax \oplus b$
that does preserve Hamming weight mod 4. We know that such a function exists because we can
decompose A into a circuit of T_4 gates by Theorem 36. Replacing each such gate with F_4 will
yield a circuit of the desired form that preserves Hamming weight mod 4.

Recall from Theorem 20 that if f preserves Hamming weight mod 4, then $|a_i| + 2(a_i \cdot b) \equiv 1 \pmod 4$.
Expanding out this condition we get

$$a_i \cdot b \equiv \begin{cases} 1 \pmod 2 & \text{if } |a_i| \equiv 3 \pmod 4, \\ 0 \pmod 2 & \text{if } |a_i| \equiv 1 \pmod 4, \end{cases}$$

which is equivalent to the condition $A^T b = c(A)$. Therefore,

$$|b| = |A\,c(A)| = \Big|\sum_{i=1}^{n} a_i c_i(A)\Big| \equiv \sum_{i=1}^{n} c_i(A)\,|a_i| + 2\sum_{i<j} c_i(A)\,c_j(A)\,(a_i \cdot a_j) \equiv \sum_{i=1}^{n} c_i(A)\,|a_i| \equiv 3\,|c(A)| \pmod 4,$$

which implies that $|b| \equiv -|c(A)| \pmod 4$. But we know by Theorem 20 that $|b| \equiv 0 \pmod 4$. So
$|c(A)| \equiv 0 \pmod 4$, which completes the first part of the lemma.

We now need to show that any characteristic vector of Hamming weight divisible by 4 is realized
by some matrix $A \in \langle T_4 \rangle_n$. Notice that $c(T_4) = (1,1,1,1)$. Therefore, by taking an appropriate
tensor product of T_4 gates and permuting the rows and columns, we can achieve any characteristic
vector of Hamming weight divisible by 4. ■
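The claim $c(T_4) = (1,1,1,1)$ can be checked directly. The sketch below assumes that T_4 acts as $x \mapsto x \oplus (|x| \bmod 2) \cdot 1111$, so that its matrix over $\mathbb{F}_2$ has zeros on the diagonal and ones elsewhere; that definition comes from outside this section and is an assumption of the example, as are the helper names.

```python
def is_orthogonal(A):
    """Check that the columns of A are orthonormal over F_2."""
    n = len(A)
    cols = [[A[i][j] for i in range(n)] for j in range(n)]
    return all(
        sum(u * v for u, v in zip(cols[i], cols[j])) % 2 == (1 if i == j else 0)
        for i in range(n)
        for j in range(n)
    )


def characteristic_vector(A):
    """Entry i is 1 when column i has weight 3 mod 4, and 0 when it has weight 1 mod 4."""
    n = len(A)
    weights = [sum(A[i][j] for i in range(n)) for j in range(n)]
    return [1 if w % 4 == 3 else 0 for w in weights]


def direct_sum(A, B):
    """Block-diagonal matrix acting as A on the first bits and B on the rest."""
    a, b = len(A), len(B)
    return [row + [0] * b for row in A] + [[0] * a + row for row in B]


# Assumed matrix of T_4: identity plus the all-ones matrix, over F_2.
T4 = [[0 if i == j else 1 for j in range(4)] for i in range(4)]
c_T4 = characteristic_vector(T4)
c_padded = characteristic_vector(direct_sum(T4, [[1]]))
```

Placing one T_4 block next to an identity block gives the characteristic vector (1,1,1,1,0), which illustrates how blocks of four ones can be positioned to realize any vector of weight divisible by 4.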


Corollary 86 $\#\langle F_4 \rangle_n = \#\langle T_4 \rangle_n$.

Proof. The condition $A^T b = c(A)$ in the proof of Lemma 85 implies that there is a unique vector
$b = A\,c(A)$ such that $f(x) = A(x) \oplus b$ is mod-4-preserving. ■

Combining Lemmas 84 and 85, we find that the number of representatives for the cosets of
$\langle T_6 \rangle_n$ in $\langle T_4 \rangle_n$ equals the number of n-bit strings with Hamming weight divisible by 4. An explicit
formula for this quantity is given by Knuth [16, p. 70]. This now completes the proof of Theorem 83.
Table 1 gives the number of generators of each class for $3 \le n \le 7$.
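The coset-counting argument translates into a short computation: divide the order of the orthogonal group by the number of n-bit strings of Hamming weight divisible by 4. The sketch below evaluates the MacWilliams formula as quoted above and a direct binomial sum rather than Knuth's closed form; the function names are illustrative.

```python
from math import comb, prod


def count_orthogonal(n):
    """Number of orthogonal n x n matrices over F_2 (formula quoted from [20])."""
    m = n // 2
    top = m - 1 if n % 2 == 0 else m
    return 2 ** (m * m) * prod(4 ** i - 1 for i in range(1, top + 1))


def count_weight_divisible_by_4(n):
    """Number of n-bit strings whose Hamming weight is a multiple of 4."""
    return sum(comb(n, w) for w in range(0, n + 1, 4))


def count_T6(n):
    """Size of the subgroup: orthogonal group order divided by the number of cosets."""
    return count_orthogonal(n) // count_weight_divisible_by_4(n)


t6_counts = [count_T6(n) for n in range(4, 8)]
```

For n = 4, 5, 6, 7 this gives 24, 120, 1440, and 40320. These agree with Table 1: for example, 1440 = 720 + 720 is the sum of the n = 6 generator counts for ⟨∅⟩ and ⟨T_6⟩, and 40320 = 5040 + 35,280 is the corresponding sum for n = 7.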



Class                n = 3    n = 4                 n = 5            n = 6               n = 7
⟨Toffoli⟩            37,980   20,919,528,228,864    2.6313 × 10^35   1.2689 × 10^89      3.8562 × 10^215
⟨Fredkin, NOT⟩       480      1,625,691,648         4.3776 × 10^26   6.9238 × 10^70      1.6100 × 10^178
⟨Fredkin, NOTNOT⟩    450      1,624,862,256         4.3776 × 10^26   6.9238 × 10^70      1.6100 × 10^178
⟨C_3⟩                36       9,953,280             5.7818 × 10^21   2.9340 × 10^60      5.1283 × 10^156
⟨C_4⟩                0        414,696               6.6368 × 10^18   5.1015 × 10^53      1.2863 × 10^142
⟨C_5⟩                0        0                     1.8962 × 10^17   1.0352 × 10^50      1.1760 × 10^133
⟨C_6⟩                0        0                     0                2.1567 × 10^48      4.4602 × 10^128
⟨C_7⟩                0        0                     0                0                   7.0797 × 10^126
⟨Fredkin⟩            30       414,696               1.8962 × 10^17   2.1567 × 10^48      7.0797 × 10^126
⟨CNOT⟩               1152     301,056               309,657,600      1,269,678,735,360   20,807,658,944,593,920
⟨CNOTNOT, NOT⟩       72       10,368                5,149,440        10,238,607,360      82,569,982,279,680
⟨CNOTNOT⟩            72       10,368                5,149,440        10,238,607,360      82,569,982,279,680
⟨T_4, NOT⟩           0        192                   9600             691,200             90,316,800
⟨T_4, NOTNOT⟩        0        144                   8400             648,000             87,494,400
⟨T_6, NOT⟩           0        0                     0                23,040              2,257,920
⟨T_6, NOTNOT⟩        0        0                     0                22,320              2,222,640
⟨T_4⟩, ⟨F_4⟩         0        24                    600              21,600              1,411,200
⟨T_6⟩                0        0                     0                720                 35,280
⟨NOT⟩                24       192                   1920             23,040              322,560
⟨NOTNOT⟩             18       168                   1800             22,320              317,520
⟨∅⟩                  6        24                    120              720                 5040

Table 1: Number of n-bit generators for each reversible gate class. 
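Theorem 87 The asymptotic size of each reversible gate class is as follows. 

\[
\begin{aligned}
\log_2 \#\langle\text{Toffoli}\rangle_n &= n2^n - \frac{2^n}{\ln 2} + \frac{n}{2} + \frac{1}{2}\log_2 2\pi + O(2^{-n}), \\
\log_2 \#\langle\text{Fredkin},\text{NOTNOT}\rangle_n &= n2^n - \frac{2^n}{\ln 2} - 2^n + n + \log_2 \pi + O(2^{-n}), \\
\log_2 \#\langle\text{Fredkin},\text{NOT}\rangle_n &= \log_2 \#\langle\text{Fredkin},\text{NOTNOT}\rangle_n + 1, \\
\log_2 \#\langle C_k\rangle_n &= n2^n - \frac{2^n}{\ln 2} - 2^n\log_2 k + o(2^n), \\
\log_2 \#\langle\text{Fredkin}\rangle_n &= n2^n - \frac{2^n}{\ln 2} - 2^n\log_2\sqrt{\frac{\pi e n}{2}} + o(2^n), \\
\log_2 \#\langle\text{CNOT}\rangle_n &= n(n+1) - \alpha + O(2^{-n}), \\
\log_2 \#\langle\text{CNOTNOT},\text{NOT}\rangle_n &= n(n-1) - \alpha + O(2^{-n}), \\
\log_2 \#\langle\text{CNOTNOT}\rangle_n &= \log_2 \#\langle\text{CNOTNOT},\text{NOT}\rangle_n - 1, \\
\log_2 \#\langle\varnothing\rangle_n &= n\log_2 n - \frac{n}{\ln 2} + \frac{1}{2}\log_2 2\pi n + O\!\left(\frac{1}{n}\right),
\end{aligned}
\]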


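\[
\begin{aligned}
\log_2 \#\langle T_4\rangle_n &= \frac{n(n-1)}{2} - \beta + O(2^{-n}), \\
\log_2 \#\langle T_6\rangle_n &= \frac{(n-1)(n-2)}{2} + 1 - \beta + O(2^{-n/2}),
\end{aligned}
\]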


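where 

\[
\begin{aligned}
\alpha &= -\sum_{i=1}^{\infty}\log_2\left(1 - 2^{-i}\right) \approx 1.7919, \\
\beta &= -\sum_{i=1}^{\infty}\log_2\left(1 - 2^{-2i}\right) \approx 0.53839.
\end{aligned}
\]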

Recall that The asymptotics of the remaining affine classes follow from the 

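rules 

\[
\begin{aligned}
\log_2 \#\langle G,\text{NOT}\rangle_n &= n + \log_2 \#\langle G\rangle_n, \\
\log_2 \#\langle G,\text{NOTNOT}\rangle_n &= n - 1 + \log_2 \#\langle G\rangle_n,
\end{aligned}
\]

where $\langle G\rangle$ is a linear class. 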


Proof. Most of these results follow directly from Theorem [83] with liberal use of well-known 
logarithm properties, especially Stirling's approximation: 

\[ \log_2(m!) = m\log_2 m - \frac{m}{\ln 2} + \frac{1}{2}\log_2 2\pi m + O\!\left(\frac{1}{m}\right). \]

For the affine classes, we use the fact that 

\[ \sum_{i=1}^{m}\log_2\left(2^i - 1\right) = \frac{m(m+1)}{2} + \sum_{i=1}^{m}\log_2\left(1 - 2^{-i}\right) = \frac{m(m+1)}{2} - \alpha + O(2^{-m}), \]

where $\alpha = -\sum_{i=1}^{\infty}\log_2(1 - 2^{-i})$. Note that $\alpha = -\log_2 (1/2;1/2)_\infty$, where $(1/2;1/2)_\infty$ is the 
q-Pochhammer symbol. Similarly, $\beta := -\sum_{i=1}^{\infty}\log_2(1 - 2^{-2i}) = -\log_2 (1/4;1/4)_\infty$ differs from 
the partial sum by $O(2^{-2m})$. 
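As a numerical sanity check on the constants and on the affine-class identity, the following is a minimal sketch (the function names `alpha`, `beta`, `log2_affine_product`, and `gap` are illustrative and not from the paper). It truncates the two series at a fixed number of terms, which is an assumption that is harmless at double precision, and compares the exact value of $\sum_{i=1}^{m}\log_2(2^i-1)$ against $m(m+1)/2 - \alpha$.

```python
import math

def alpha(terms=200):
    """Truncated series for alpha = -sum_{i>=1} log2(1 - 2^-i)."""
    return -sum(math.log2(1 - 2.0 ** -i) for i in range(1, terms + 1))

def beta(terms=200):
    """Truncated series for beta = -sum_{i>=1} log2(1 - 2^-(2i))."""
    return -sum(math.log2(1 - 4.0 ** -i) for i in range(1, terms + 1))

def log2_affine_product(m):
    """log2 of prod_{i=1}^m (2^i - 1), taken from an exact integer product."""
    prod = 1
    for i in range(1, m + 1):
        prod *= (1 << i) - 1
    return math.log2(prod)

def gap(m):
    """Exact sum minus the approximation m(m+1)/2 - alpha."""
    return log2_affine_product(m) - (m * (m + 1) / 2 - alpha())
```

Calling `gap(m)` for increasing `m` should show the difference shrinking roughly like $2^{-m}$, in line with the stated $O(2^{-m})$ error term.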

It turns out that the even and odd cases of $\#\langle T_4\rangle$ have the same asymptotic behavior, and 
similarly for the four cases of $\#\langle T_6\rangle$. 

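However, there are two special cases that require extra care: $\langle C_k\rangle$ (for $k \ge 3$) and $\langle\text{Fredkin}\rangle$. 
Recall that 

\[ \#\langle C_k\rangle_n = \prod_{i=0}^{k-1} (a_i!), \]

where we define $a_i = \sum_{j \equiv i \pmod{k}} \binom{n}{j}$. Clearly $a_i = \frac{2^n}{k}(1 + o(1))$. Then Stirling's approximation 
gives 

\[
\begin{aligned}
\log_2 \#\langle C_k\rangle_n &= \sum_{i=0}^{k-1}\left(a_i\log_2 a_i - \frac{a_i}{\ln 2} + O(\log_2 a_i)\right) \\
&= \sum_{i=0}^{k-1}\frac{2^n}{k}\left(n - \log_2 k - \frac{1}{\ln 2}\right)\times(1 + o(1)) \\
&= n2^n - \frac{2^n}{\ln 2} - 2^n\log_2 k + o(2^n).
\end{aligned}
\]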
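The following sketch checks the $\langle C_k\rangle$ estimate numerically. It assumes the class size is $\prod_{i=0}^{k-1} a_i!$ with $a_i$ the number of n-bit strings of Hamming weight congruent to $i$ mod $k$, as recalled above; the function names are illustrative only, and `math.lgamma` is used to avoid forming the huge factorials.

```python
import math

def residue_counts(n, k):
    """a_i = number of n-bit strings with Hamming weight congruent to i mod k."""
    a = [0] * k
    for j in range(n + 1):
        a[j % k] += math.comb(n, j)
    return a

def log2_class_size(n, k):
    """log2 of prod_i (a_i!), using lgamma(a + 1) = ln(a!)."""
    return sum(math.lgamma(a + 1) for a in residue_counts(n, k)) / math.log(2)

def leading_terms(n, k):
    """n 2^n - 2^n / ln 2 - 2^n log2 k, i.e. the estimate without the o(2^n) term."""
    return n * 2 ** n - 2 ** n / math.log(2) - 2 ** n * math.log2(k)
```

For example, dividing `log2_class_size(20, 3) - leading_terms(20, 3)` by `2 ** 20` gives a value close to zero, which is what an $o(2^n)$ error term predicts.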

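For $\langle\text{Fredkin}\rangle$, we use the fact that if $x$ is a uniformly random n-bit string, then the entropy of 
$|x|$ is 

\[ -\sum_{i=0}^{n}\frac{\binom{n}{i}}{2^n}\log_2\frac{\binom{n}{i}}{2^n} = \log_2\sqrt{\frac{\pi e n}{2}} + O\!\left(\frac{1}{n}\right). \]

One can show this by approximating the binomial with a Gaussian distribution. Rearranging gives 
us 

\[ \sum_{i=0}^{n}\binom{n}{i}\log_2\binom{n}{i} = n2^n - 2^n\log_2\sqrt{\frac{\pi e n}{2}} - O\!\left(\frac{2^n}{n}\right). \]

Now we can apply Stirling's approximation to $\#\langle\text{Fredkin}\rangle_n$, as calculated in Theorem [83]: 

\[
\begin{aligned}
\log_2 \#\langle\text{Fredkin}\rangle_n &= \sum_{i=0}^{n}\left[\binom{n}{i}\log_2\binom{n}{i} - \frac{\binom{n}{i}}{\ln 2} + o\!\left(\binom{n}{i}\right)\right] \\
&= n2^n - \frac{2^n}{\ln 2} - 2^n\log_2\sqrt{\frac{\pi e n}{2}} + o(2^n).
\end{aligned}
\]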
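The entropy estimate used for $\langle\text{Fredkin}\rangle$ can be checked directly. The sketch below (names are illustrative) computes the exact Shannon entropy of the Hamming weight of a uniform n-bit string and compares it with the Gaussian estimate $\frac{1}{2}\log_2(\pi e n/2)$; it is a numerical illustration, not a proof of the $O(1/n)$ error term.

```python
import math

def weight_entropy(n):
    """Shannon entropy (in bits) of |x| for x uniform over {0,1}^n."""
    total = 2 ** n
    h = 0.0
    for i in range(n + 1):
        p = math.comb(n, i) / total  # probability that |x| = i
        h -= p * math.log2(p)
    return h

def gaussian_estimate(n):
    """(1/2) log2(pi * e * n / 2), the entropy of the matching Gaussian."""
    return 0.5 * math.log2(math.pi * math.e * n / 2)
```

Evaluating both functions at a few hundred bits shows them agreeing to a few decimal places, with the gap shrinking as n grows.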


One can clearly see “the pervasiveness of universality” in Table [1]: within almost every class, 
the gates that are universal for that class quickly come to dominate the gates that are not universal 
for that class in number. Theorem [87] lets us make that observation rigorous. 


Corollary 88 Let $C$ be any reversible gate class, and let $G$ be an n-bit gate chosen uniformly at 
random from $C$. Then 

\[ \Pr[G \text{ generates } C] = 1 - O(2^{-n}), \]

unless $C$ is one of the “NOT classes” ($\langle\text{Fredkin},\text{NOT}\rangle$, $\langle F_4,\text{NOT}\rangle$, $\langle T_6,\text{NOT}\rangle$, or $\langle\text{NOT}\rangle$), in 
which case 

\[ \Pr[G \text{ generates } C] = \frac{1}{2} - O(2^{-n}). \]


15 Appendix: Alternate Proofs of Theorems [12] and [19] 


Alternate Proof of Theorem [12]. Suppose $j \not\equiv 0 \pmod{k}$, and let $q$ be $j$'s order mod $k$ (that 
is, the least positive $i$ such that $ij \equiv 0 \pmod{k}$). We first show that $q$ must be a power of 
2. For $i \in \{0, \ldots, q-1\}$, let $S_i$ be the set of all $x \in \{0,1\}^n$ whose Hamming weight satisfies 
$|x| \equiv ij \pmod{k}$. Note that $q$ is the number of distinct $S_i$'s. Now, since the gate $G$ maps everything in 
$S_i$ to $S_{(i+1) \bmod q}$, we have 

\[ |S_0| = \cdots = |S_{q-1}| = \frac{2^n}{q}. \]

But the above must be an integer. 

Observe that, if there existed a $G$ such that $|G(x)| \equiv |x| + j \pmod{k}$, where $j$'s order mod $k$ 
was any positive power of 2 (say $2^p$), then the iterated map would satisfy 

\[ \left|G^{2^{p-1}}(x)\right| \equiv |x| + \frac{k}{2} \pmod{k} \]

and so would have order exactly 2 mod $k$. For that reason, it suffices to rule out, for all $k \ge 2$ and 
all $n$, the possibility of a reversible transformation $G$ that satisfies 

\[ |G(x)| \equiv |x| + k \pmod{2k} \]

for all $x \in \{0,1\}^n$. 

To do the above, it is necessary and sufficient to show that there is a “cardinality obstruction” 
to any $G$ of the required form. In other words, for all $j \in \{0, \ldots, 2k-1\}$, let 

\[ A_{n,j} := \{x \in \{0,1\}^n : |x| \equiv j \pmod{2k}\} \]

be the set of n-bit strings of Hamming weight $j$ mod $2k$. Then the problem boils down to showing 
that for all $k \ge 2$ and $n$, there exists a $j < k$ such that $|A_{n,j}| \ne |A_{n,j+k}|$, and therefore, that no 
mapping from $A_{n,j}$ to $A_{n,j+k}$ (or vice versa) can be reversible. 

This, in turn, can be interpreted as a statement about binomial coefficients: for all $k \ge 2$ and 
all $n$, there exists a $j$ such that 

\[ \sum_{i = j,\, j+2k,\, j+4k,\, \ldots}\binom{n}{i} \ne \sum_{i = j+k,\, j+3k,\, j+5k,\, \ldots}\binom{n}{i}. \]

A nice way to prove the above statement is by using what we call the wraparound Pascal's triangle 
of width $2k$: that is, Pascal's triangle with a periodic boundary condition. This is simply an 
iterative map on row vectors $(a_0, \ldots, a_{2k-1}) \in \mathbb{R}^{2k}$, obtained by starting from the row $(1, 0, \ldots, 0)$, 
then repeatedly applying the update rule $a'_i := a_i + a_{(i-1) \bmod 2k}$ for all $i$. So for example, when 
$2k = 4$ we obtain 

     1   0   0   0
     1   1   0   0
     1   2   1   0
     1   3   3   1
     2   4   6   4
     6   6  10  10
    16  12  16  20


It is not hard to see that the $i$-th entry of the $n$-th row of the above “triangle” encodes $|A_{n,i}|$: that 
is, the number of n-bit strings whose Hamming weights are congruent to $i$ mod $2k$. 
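To make the counting interpretation concrete, here is a small sketch (function names are illustrative, and rows are indexed from 0 so that row $n$ corresponds to n-bit strings). It builds the wraparound Pascal's triangle from the update rule and compares each row with a brute-force count of bit strings by Hamming weight mod the width.

```python
from itertools import product

def wraparound_rows(width, num_rows):
    """First num_rows rows of the wraparound Pascal's triangle of the given width."""
    row = [1] + [0] * (width - 1)
    rows = [row]
    for _ in range(num_rows - 1):
        # update rule: a'_i = a_i + a_{(i-1) mod width}
        row = [row[i] + row[(i - 1) % width] for i in range(width)]
        rows.append(row)
    return rows

def weight_counts(n, width):
    """counts[i] = number of n-bit strings whose Hamming weight is i mod width."""
    counts = [0] * width
    for bits in product((0, 1), repeat=n):
        counts[sum(bits) % width] += 1
    return counts
```

With `width = 4`, the rows produced by `wraparound_rows(4, 7)` reproduce the seven rows listed in the example, and each agrees with `weight_counts(n, 4)`.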

So the problem reduces to showing that, when $k \ge 2$, no row of the wraparound Pascal's triangle 
of width $2k$ can have the form 

\[ (a_0, \ldots, a_{k-1}, a_0, \ldots, a_{k-1}). \]

That is, no row can consist of the same list of $k$ numbers repeated twice. (Note we can get rows 
that satisfy $a_i = a_{i+k}$ for specific values of $i$: to illustrate, in the width-4 case above, we have 
$a_1 = a_3 = 4$ in the fifth row, and $a_0 = a_2 = 16$ in the seventh row. But we need to show that no 
row can satisfy $a_i = a_{i+k}$ for all $i \in \{0, \ldots, k-1\}$ simultaneously.) We prove this as follows. 

Notice that the update rule that defines the wraparound Pascal's triangle, namely $a'_i := a_i + 
a_{(i-1) \bmod 2k}$, is just a linear transformation on $\mathbb{R}^{2k}$, corresponding to a $2k \times 2k$ band-diagonal matrix 
$M$. For example, when $k = 2$ we have 

\[ M = \begin{pmatrix} 1 & 1 & 0 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 1 \\ 1 & 0 & 0 & 1 \end{pmatrix}. \]


Notice further that $\operatorname{rank}(M) = 2k - 1$. The image of $M$ is a $(2k-1)$-dimensional subspace 
$P \le \mathbb{R}^{2k}$ (the “parity-respecting subspace”), which is defined by the linear equation 

\[ a_0 + a_2 + \cdots + a_{2k-2} = a_1 + a_3 + \cdots + a_{2k-1}. \]

Thus, $M$ acts invertibly, as long as we restrict to vectors in $P$. 
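The rank claim can be checked for small $k$ with exact arithmetic. The sketch below is illustrative only: it assumes $M$ acts on row vectors from the right (so that $(aM)_i = a_i + a_{i-1}$, matching the $k = 2$ example), and it uses plain Gaussian elimination over the rationals rather than any particular linear-algebra library.

```python
from fractions import Fraction

def wraparound_matrix(k):
    """2k x 2k matrix with M[j][i] = 1 when entry j feeds entry i of the next row."""
    size = 2 * k
    m = [[0] * size for _ in range(size)]
    for i in range(size):
        m[i][i] = 1
        m[(i - 1) % size][i] = 1
    return m

def rank(matrix):
    """Rank over the rationals via Gaussian elimination with exact fractions."""
    rows = [[Fraction(v) for v in r] for r in matrix]
    rnk = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rnk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rnk], rows[pivot] = rows[pivot], rows[rnk]
        for r in range(len(rows)):
            if r != rnk and rows[r][col] != 0:
                factor = rows[r][col] / rows[rnk][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rnk])]
        rnk += 1
    return rnk

def in_parity_subspace(vec):
    """True when the even-indexed and odd-indexed entries have equal sums."""
    return sum(vec[0::2]) == sum(vec[1::2])
```

For each small $k$, `rank(wraparound_matrix(k))` comes out to $2k - 1$, and every row of the matrix (hence every vector in its row space) satisfies the parity equation.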

Next, let $D \le \mathbb{R}^{2k}$ (the “duplicate subspace”) be the $k$-dimensional subspace defined by the $k$ 
linear equations 

\[ a_0 = a_k, \; \ldots, \; a_{k-1} = a_{2k-1}. \]

Then let $S = P \cap D$ be the $(k-1)$-dimensional intersection of the parity-respecting and duplicate 
subspaces. 

Observe that $S$ is an invariant subspace of $M$: that is, if $x \in S$, then $Mx \in S$. But now, 
using the fact that $M$ acts invertibly within $P$, this means that the converse also holds: namely, 
if $x \in P \setminus S$, then $Mx \in P \setminus S$. In other words: as we generate more and more rows of the 
wraparound Pascal's triangle, if we're not already in $S$ by the second row (i.e., after the first time 
we've applied $M$), then we're never going to get into $S$. 

Now, the first row of the wraparound Pascal's triangle is $(1, 0, \ldots, 0)$, and the second row is 
$(1, 1, 0, \ldots, 0)$. This second row is not in $S$ unless $k = 1$. ■ 
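The conclusion of this proof can be spot-checked by direct search. The following sketch (illustrative names; a finite search, so it supports the statement without proving it) generates rows of the wraparound Pascal's triangle of width $2k$ and reports the first row, if any, whose two halves coincide.

```python
def wraparound_rows(k, num_rows):
    """Yield the first num_rows rows of the wraparound triangle of width 2k."""
    width = 2 * k
    row = [1] + [0] * (width - 1)
    for _ in range(num_rows):
        yield row
        row = [row[i] + row[(i - 1) % width] for i in range(width)]

def first_duplicate_row(k, num_rows):
    """Index of the first row of the form (a_0..a_{k-1}, a_0..a_{k-1}), or None."""
    for index, row in enumerate(wraparound_rows(k, num_rows)):
        if row[:k] == row[k:]:
            return index
    return None
```

For $k = 1$ the second row $(1, 1)$ already has the duplicate form, while for $k = 2, \ldots, 6$ a search over the first 200 rows finds no such row.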
Alternate Proof of Theorem [19]. We will actually prove a stronger result, that if $G$ is any 
nontrivial affine gate that preserves Hamming weight mod $k$, then either $k = 2$ or $k = 4$. We 
have $G(x) = Ax \oplus b$, where $A$ is an $n \times n$ invertible matrix over $\mathbb{F}_2$, and $b \in \mathbb{F}_2^n$. Since $G$ 
is nontrivial, Lemma [18] implies that at least one of $A$'s column vectors $v_1, \ldots, v_n$ must have 
Hamming weight at least 2; assume without loss of generality that $v_1$ is such a column. Notice 
that $|G(0^n)| = |b| \equiv 0 \pmod{k}$, while 

\[ |G(e_1)| = |v_1 \oplus b| \equiv 1 \pmod{k}. \]

Clearly $|G(e_1)| \equiv |e_1| \equiv 1 \pmod{k}$. Let $y$ be an n-bit string whose first bit is 0. Then by 
Lemma [17] we have 

\[
\begin{aligned}
1 + |y| &= |e_1 \oplus y| \\
&\equiv |G(e_1 \oplus y)| \\
&= |G(e_1) \oplus G(y) \oplus b| \\
&= |Ae_1 \oplus b \oplus b \oplus G(y)| \\
&= |v_1 \oplus G(y)| \\
&= |v_1| + |G(y)| - 2(v_1 \cdot G(y)) \\
&\equiv |v_1| + |y| - 2(v_1 \cdot G(y)) \pmod{k}.
\end{aligned}
\]

Thus 

\[ 2(v_1 \cdot G(y)) \equiv |v_1| - 1 \pmod{k}. \]
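The chain of equalities above relies on the identity $|u \oplus w| = |u| + |w| - 2(u \cdot w)$, where we read $u \cdot w$ as the integer inner product (the number of positions in which both strings are 1); that reading is an assumption on our part. A brute-force check over all pairs of short bit strings, with illustrative function names:

```python
from itertools import product

def weight(bits):
    """Hamming weight of a tuple of bits."""
    return sum(bits)

def xor(u, w):
    """Bitwise XOR of two equal-length bit tuples."""
    return tuple(a ^ b for a, b in zip(u, w))

def overlap(u, w):
    """Integer inner product: number of positions where both bits are 1."""
    return sum(a & b for a, b in zip(u, w))

def identity_holds(n):
    """Check |u xor w| = |u| + |w| - 2 (u . w) for all pairs of n-bit strings."""
    strings = list(product((0, 1), repeat=n))
    return all(
        weight(xor(u, w)) == weight(u) + weight(w) - 2 * overlap(u, w)
        for u in strings
        for w in strings
    )
```

The identity holds because each position where both strings are 1 contributes 2 to $|u| + |w|$ but 0 to $|u \oplus w|$.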


Note that the above equation must hold for all $2^{n-1}$ possible $y$'s that start with 0. Such $y$'s, of 
course, account for half of all n-bit strings. So we deduce that 

\[ \Pr_{x \in \{0,1\}^n}\left[2(v_1 \cdot x) \equiv |v_1| - 1 \pmod{k}\right] \ge \frac{1}{2}. \]

Equivalently, if we let $S$ be the set of all $x \in \{0,1\}^{|v_1|}$ such that $2|x| \equiv |v_1| - 1 \pmod{k}$, then we 
find that 

\[ \Pr_{x \in \{0,1\}^{|v_1|}}[x \in S] \ge \frac{1}{2}, \tag{8} \]

or $|S| \ge 2^{|v_1|-1}$. But we will prove this impossible. 

First suppose $k$ is even. Then for the inequality (8) to have any chance of being satisfied, $|v_1|$ 
needs to be odd, so assume it is. Then $S$ equals the set of all $x \in \{0,1\}^{|v_1|}$ such that 

\[ |x| \equiv 0 \pmod{k/2}. \tag{9} \]


If $k = 2$, then $|x| \equiv 0 \pmod{1}$ holds for all $x$, while if $k = 4$, then $|x| \equiv 0 \pmod{2}$ holds whenever 
$|x|$ is even. In either case, (8) is satisfied. On the other hand, suppose $k \ge 6$. Then we claim 
that (8) cannot hold; in other words, that $|S| < 2^{|v_1|-1}$. To prove this, let 

\[ S' := \{x \oplus e_1 : x \in S\} \]

contain, for each $x \in S$, the string $x'$ obtained by flipping the first bit of $x$. Then clearly $|S'| = |S|$, 
and $S$ and $S'$ are disjoint (since no two elements of $S$ are neighbors in the Hamming cube). So 
it suffices to show that $S \cup S'$ still does not cover all of $\{0,1\}^{|v_1|}$. Since $k/2 \ge 3$, observe that 
$S'$ can contain at most one string of Hamming weight 1, namely $x' = 10 \cdots 0$ (the neighbor of 
$x = 0^{|v_1|}$). But since $|v_1| \ge 2$, there are other strings of Hamming weight 1, not included in $S'$. 
Hence $S \cup S' \ne \{0,1\}^{|v_1|}$. 

Next suppose $k \ge 3$ is odd. Then first, we claim that we cannot have $|v_1| = 2$. For suppose 
we did. Then $|b \oplus v_1|$ would be either $|b|$, or $|b| - 2$, or $|b| + 2$. But this contradicts the facts that 
$|b| \equiv 0 \pmod{k}$, while $|b \oplus v_1| \equiv 1 \pmod{k}$. Since $|v_1| \ne 1$, this means that $|v_1| \ge 3$. But in that 
case, we can use a similar argument as before to show that (8) cannot hold, and that $|S| < 2^{|v_1|-1}$. 

Letting $S'$ be as above, we again have that $|S| = |S'|$, and that $S$ and $S'$ are disjoint. And we will 
again show that $S \cup S'$ fails to cover all of $\{0,1\}^{|v_1|}$. Notice that, since the Hamming weights of 
the $S$ elements are separated by $k \ge 3$, every $S'$ element that is “below” an $S$ element must start 
with 0, and every $S'$ element that is “above” an $S$ element must start with 1. Also, since $|v_1| \ge 3$, 
there must be some $x' \in S'$ with a Hamming weight that is neither maximal nor minimal (that 
is, neither $|v_1|$ nor 0). But since the first bit of $x'$ has a fixed value, not all strings of Hamming 
weight $|x'|$ can belong to $S'$. Hence $S \cup S' \ne \{0,1\}^{|v_1|}$, and $|S| < 2^{|v_1|-1}$. ■ 
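The counting claim at the heart of this proof can also be explored numerically. The sketch below assumes our reading of the definition of $S$, namely the set of $x \in \{0,1\}^m$ with $2|x| \equiv m - 1 \pmod{k}$ where $m = |v_1| \ge 2$; the function names are illustrative. It lists the moduli $k$ for which $S$ can contain at least half of all m-bit strings within a finite search range.

```python
import math

def size_of_S(m, k):
    """Number of x in {0,1}^m with 2|x| congruent to m - 1 mod k."""
    return sum(
        math.comb(m, w)
        for w in range(m + 1)
        if (2 * w - (m - 1)) % k == 0
    )

def majority_cases(max_m, max_k):
    """Moduli k >= 2 for which |S| >= 2^(m-1) for some m in 2..max_m."""
    return sorted({
        k
        for k in range(2, max_k + 1)
        for m in range(2, max_m + 1)
        if 2 * size_of_S(m, k) >= 2 ** m
    })
```

Within the range searched, only $k = 2$ and $k = 4$ appear, consistent with the statement being proved; this is a finite check and does not replace the argument above.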